Sunday, February 14, 2010

Partitioning an ASIC Design into Multiple FPGAs

CMP - United Business Media, Programmable Logic DesignLine

February 10, 2010

By Juergen Jaeger, Synopsys Inc.

Most of today's system-on-chip (SoC) designs rely on field-programmable gate arrays (FPGAs) as a way to accelerate verification, start software development early and validate the whole system before committing to silicon. The FPGA may be an intermediate or, because tough economic realities cannot justify $1M+ in non-recurring engineering charges for an ASIC, initial implementation platform for the SoC design.

Today's FPGAs are large enough to contain a complex system-level design. It is often practical, however, to partition these designs among several FPGAs for various reasons. For example, you will usually need external components in your system. Also, using several smaller devices can enable a more cost-effective solution than using one big FPGA.

But, integrating your design into several FPGAs can create interesting partitioning problems, especially for larger and/or highly connected designs.

What are the Major Partitioning Considerations?

The most obvious problem for any design is the answer to the question: Will it fit into your FPGA prototype? If you have a very small design, you may fit everything onto a single, large FPGA and you technically won't have a real partitioning problem.

Figure 1: ASIC design start sizes

A note of caution: even though you may think that your design fits based on the ASIC gate count, your design may still need to be partitioned because of the resources available on the target FPGA. Memory or DSP-intensive ASICs frequently fall into this category of design.

Based on the current design sizes as shown in Figure 1, one-third to one-half of ASIC designs will fit into one of today's large FPGAs. Assuming that you have a bigger design and partitioning is required, you need to carefully estimate the number of FPGAs required in your prototyping hardware.
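As a rough illustration of that estimate, the sketch below computes a lower bound on the FPGA count from per-resource demand. The resource names, the figures, and the 60% utilization ceiling are hypothetical placeholders, not data from any real device or from Figure 1, and the model ignores inter-FPGA connectivity entirely.

```python
import math

def estimate_fpga_count(demand, capacity, max_utilization=0.6):
    """Return (count, limiting_resource) as a lower bound on FPGAs needed.

    demand and capacity map a resource name to an amount; capacity is per FPGA.
    max_utilization is an assumed ceiling on how full each device may be.
    """
    counts = {
        resource: max(1, math.ceil(needed / (capacity[resource] * max_utilization)))
        for resource, needed in demand.items()
    }
    limiting = max(counts, key=counts.get)
    return counts[limiting], limiting

# Hypothetical design and device figures, for illustration only.
demand = {"luts": 400_000, "bram_kbits": 28_000, "dsp": 500}
capacity = {"luts": 200_000, "bram_kbits": 10_000, "dsp": 1_000}

count, limiting = estimate_fpga_count(demand, capacity)
print(f"at least {count} FPGAs, limited by {limiting}")
```

With these made-up numbers the memory requirement, not the logic count, sets the number of devices, which is the situation the note of caution above describes.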

When you must partition, the three big concerns to keep in mind are:

  1. Which blocks need to fit into which FPGA so that you do not exceed the capacity or other resources of the FPGAs in your hardware-prototyping system?
  2. How do you interconnect the FPGAs? Most ASIC designs, once partitioned, need more inter-FPGA signals than the FPGAs have available I/Os. The pin shortage is further compounded by the need to meet timing.
  3. Finally, ASIC designs often include elements that need to be converted to an appropriate form for an FPGA implementation, such as ASIC memories or gated-clock tree structures.
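The first two concerns can be checked mechanically for any candidate assignment. The following is a minimal sketch, assuming a simplified model in which each block has a single area number and each net is one signal; the block names, areas, and limits are hypothetical.

```python
from collections import defaultdict

def check_partition(block_area, nets, assignment, capacity, io_pins):
    """Report per-FPGA area and inter-FPGA signal counts for one assignment.

    nets is a list of tuples of block names; each net counts as one signal.
    """
    area = defaultdict(int)
    for block, fpga in assignment.items():
        area[fpga] += block_area[block]

    pins = defaultdict(int)
    for net in nets:
        fpgas = {assignment[block] for block in net}
        if len(fpgas) > 1:  # the net crosses a partition boundary
            for fpga in fpgas:
                pins[fpga] += 1

    problems = []
    for fpga in sorted(set(assignment.values())):
        if area[fpga] > capacity:
            problems.append(f"{fpga}: area {area[fpga]} exceeds {capacity}")
        if pins[fpga] > io_pins:
            problems.append(f"{fpga}: {pins[fpga]} signals exceed {io_pins} pins")
    return dict(area), dict(pins), problems

# Hypothetical four-block design split across two FPGAs, A and B.
block_area = {"cpu": 60, "mem": 40, "dsp": 50, "io": 20}
nets = [("cpu", "mem"), ("cpu", "dsp"), ("dsp", "io"), ("mem", "dsp"), ("cpu", "io")]
assignment = {"cpu": "A", "mem": "A", "dsp": "B", "io": "B"}

area, pins, problems = check_partition(block_area, nets, assignment,
                                       capacity=100, io_pins=2)
for line in problems:
    print(line)
```

In this example both FPGAs fit on area but each needs three inter-FPGA signals against a budget of two, so the assignment fails on pins alone.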

Compounding these concerns is the decision of whether you want to partition the design at RTL or later in the design phase, such as at the gate level. Both approaches have advantages and disadvantages and are also, in part, dependent on the rest of your design flow.

Partitioning at What Level?

One of the first questions to ask when you partition is whether you want to partition at a netlist-level abstraction or a higher level of abstraction. There are advantages and disadvantages to both. The primary advantage of partitioning at the netlist level is that area estimates are easier to calculate and more accurate, since your design is largely implemented with the optimizations already taken into account. It is therefore easier to estimate the gate count and know how much room you will need. On the other hand, the very large netlist database may make partitioning difficult to perform. For example, a design with 90,000 lines of RTL may have a file size of 3.5 Mbytes, while the same design in netlist format may be 6.4 Mbytes in size. The difference could be much greater if the RTL contains a large number of instantiated blocks; a 4x difference would not be unusual. In addition, for debugging purposes, it is very difficult to trace problems in the original RTL from the large, flat netlist database. Many things will have changed as the design went from an RTL implementation to a netlist implementation, including net names.

Manual Partitioning

Historically, many designs were partitioned manually. There are two major reasons for this. First, it may simply be standard procedure, the way things were always done. This is often the case when the new design has only minor changes from the old design, and partitioning may not be that difficult. The second reason for manual partitioning may be a lack of budget or time to invest in adding a new tool to a flow that already works. It is not uncommon to see designs partitioned in Microsoft Excel, with a mapping sheet to track the pin and trace assignments.

If you decide to partition manually, keep these factors in mind:

  1. You need to perform a chip floor plan to ensure that the partitions revolve around the design's bus structures and data paths. In other words, you need to make sure that timing-critical modules are kept together and in close proximity.
  2. You need to ensure that extra logic does not need to be placed next to multiplex signals when there are not enough I/Os between FPGAs. You also must have intimate knowledge of the FPGAs that you will be using to ensure that the correct resources are available for your partition to function as expected.
  3. Finally, you need to perform gated-clock conversions, such as the conversion shown in Figure 2, to ensure that your ASIC design uses the primary FPGA clocks without introducing skews or timing issues.

Figure 2: Gated clock conversion
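One common form of this conversion replaces a register clocked by a gated clock with a register on the free-running clock plus a clock enable. The behavioral sketch below is an illustration under stated assumptions, not a reproduction of Figure 2: time advances in half clock periods, the enable changes only while the clock is low, and setup and hold times are ignored. The function names are hypothetical.

```python
def simulate_gated(clk, en, d, q0=0):
    """Register clocked by the gated clock (clk AND en)."""
    q, prev, out = q0, 0, []
    for c, e, data in zip(clk, en, d):
        gclk = c & e
        if prev == 0 and gclk == 1:  # rising edge of the gated clock
            q = data
        prev = gclk
        out.append(q)
    return out

def simulate_enable(clk, en, d, q0=0):
    """Register on the free-running clock, loading only when en is high."""
    q, prev, out = q0, 0, []
    for c, e, data in zip(clk, en, d):
        if prev == 0 and c == 1 and e == 1:  # rising clock edge with enable set
            q = data
        prev = c
        out.append(q)
    return out

# One list entry per half clock period; en changes only while clk is low.
clk = [0, 1] * 4
en = [1, 1, 0, 0, 1, 1, 0, 0]
d = [1, 1, 0, 0, 0, 0, 1, 1]

gated_q = simulate_gated(clk, en, d)
enable_q = simulate_enable(clk, en, d)
print("outputs match:", gated_q == enable_q)
```

The two registers produce the same output sequence, but the second one keeps every flip-flop on the global clock network, which is what lets the design use the primary FPGA clocks.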

For the most minor of changes to an existing design, the completely manual approach may be feasible, especially if done by an expert designer at the RTL level. But, for anything more complex, the manual option is not really "manual". Manual gated-clock conversion for a multi-million gate ASIC is impractical and is usually done through a script or user-written program. Similarly, the logic to multiplex signals between the FPGAs is also done through a user-written program. It would be very difficult to keep all the variables in mind without also writing error-checking routines to ensure that you have used the proper number of resources on the FPGA. It is also doubtful that floor planning can be done manually at the netlist level because of the database sizes and the fact that the design is already near a final implementation phase.
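The signal multiplexing mentioned above is usually some form of time-division multiplexing: several logical signals share one physical pin in successive time slots. A minimal sketch of the idea follows, assuming a fixed slot order and zero padding; the function names are hypothetical and the model leaves out the clocking and synchronization logic a real implementation needs.

```python
import math

def tdm_send(signals, pins):
    """Split signal values into time slots, each carrying one value per pin."""
    ratio = math.ceil(len(signals) / pins)  # multiplexing ratio
    padded = signals + [0] * (ratio * pins - len(signals))
    return [padded[k * pins:(k + 1) * pins] for k in range(ratio)]

def tdm_receive(slots, count):
    """Reassemble the original signal values on the receiving FPGA."""
    flat = [bit for slot in slots for bit in slot]
    return flat[:count]

signals = [1, 0, 1, 1, 0, 1, 0]  # seven signals must cross the boundary
slots = tdm_send(signals, pins=3)  # only three pins are available
recovered = tdm_receive(slots, len(signals))

print("time slots per system cycle:", len(slots))
print("recovered correctly:", recovered == signals)
```

Every system clock cycle now needs as many transfer cycles as there are slots, so a higher multiplexing ratio saves pins at the cost of a slower prototype.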

Remember, too, that the cost of using the manual partitioning approach is not just the cost of doing things manually, but also the cost of writing and maintaining the conversion and checking programs that need to be written to make the partitioning of a large ASIC feasible. For large designs, except for companies with large internal CAD teams, it is quite likely that the cost in terms of time and support will be prohibitive.

Automatic Partitioning

A small number of vendors offer automatic partitioners. Any automatic partitioning tool should offer the following features:

  1. The ability to consider and optimize for both area and pin requirements to reach a viable solution.
  2. The ability to run at both an RTL and a netlist level of abstraction (An existing design requiring minor changes might be done at the netlist level, while a new design might be at the RTL level.)
  3. Quick area-estimation capability to give you a general feel and confidence that your design will fit onto your prototyping board.
  4. Understanding of the functionality and capabilities of your prototyping board. For example, if your prototyping board has dedicated high-speed clock lines, the partitioning tool's ability to recognize and utilize this functionality greatly enhances your ability to meet system timing.
  5. The ability to set threshold levels to control the amount of logic allocated to each FPGA. You want to be able to set the lowest threshold possible because a near-capacity FPGA may lead to long place and route times. Conversely, you may want to set a high threshold if you have a very large design and have a limited number of FPGAs on your prototyping board.
  6. The ability to have your automatic partitioning tool run on a board design with undefined traces. This allows you to determine if it is feasible to fit a design within the intended pin and area constraints before investing in the purchase or development of a board. Running on a board design with predefined traces produces a comprehensive signal-to-trace assignment report for detailed analysis.
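To make the interplay of area, connectivity, and thresholds concrete, here is a deliberately simple greedy sketch. It is not the algorithm of any commercial partitioner; the block names, areas, and the 70% threshold are hypothetical, and real tools use far more sophisticated optimization.

```python
def greedy_partition(block_area, nets, num_fpgas, capacity, threshold=0.7):
    """Place blocks largest first on the FPGA they share the most nets with,
    never filling any FPGA beyond threshold * capacity."""
    limit = capacity * threshold
    used = [0] * num_fpgas
    assignment = {}
    for block in sorted(block_area, key=block_area.get, reverse=True):
        # Count nets shared with blocks already placed on each FPGA.
        affinity = [0] * num_fpgas
        for net in nets:
            if block in net:
                for other in net:
                    if other in assignment:
                        affinity[assignment[other]] += 1
        candidates = [f for f in range(num_fpgas)
                      if used[f] + block_area[block] <= limit]
        if not candidates:
            raise ValueError(f"no FPGA has room for {block} at this threshold")
        # Prefer the strongest connection, then the emptiest device.
        best = max(candidates, key=lambda f: (affinity[f], -used[f]))
        assignment[block] = best
        used[best] += block_area[block]
    return assignment

# Hypothetical design: two FPGAs of capacity 100, filled to at most 70%.
block_area = {"cpu": 40, "dsp": 35, "mem": 25, "io": 10}
nets = [("cpu", "mem"), ("cpu", "mem"), ("dsp", "io"), ("cpu", "dsp")]

assignment = greedy_partition(block_area, nets, num_fpgas=2, capacity=100)
print(assignment)
```

Lowering the threshold to 0.5 in this example leaves no room for the third block and raises an error, which mirrors the trade-off in item 5: a low threshold eases place and route but may require more FPGAs than the board provides.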

Even with the features described above, it is unlikely that you will be able to fully automate the partitioning of a design without some prior knowledge or intervention. There are two major stumbling blocks to automation that most current tools do not address: (a) the ability to handle black box IP and (b) the ability to optimally utilize all the features within the FPGA such as block RAMs and DSP blocks.

The most efficient route to partitioning is to have a partitioning tool that has both manual and automatic features, such as the one shown in Figure 3. Using this interactive flow, you can partition some sections of the design automatically while partitioning other sections manually, especially if you have design-specific information in mind.

Figure 3: Interactive partitioning

Integration Considerations

Up to this point, partitioning has been presented as an individual task in the verification flow. To successfully verify a design, a partitioning tool needs to fit easily into your team's existing flow. While a GUI is important, it is equally important that your tool support Tcl commands and scripts, as many ASIC designers script most of their design work. Support for standards like Synopsys Design Constraints (SDC) is also important. SDC support allows you to use your existing ASIC constraints to drive the partitioning process and ensures that your prototype system complies with other design requirements.

Debugging is a necessary step in ASIC verification. Your partitioning tool must be able to define probe points to allow internal signals to be monitored as part of the I/O interface. Integration with the most common FPGA debugging tools, such as Xilinx's ChipScope, Altera's SignalTap and Synopsys' Identify, greatly helps the debugging effort and allows users to work with the tools they are most familiar with.

Another key capability of a partitioning tool is that it supports the FPGA devices on the prototyping board. This may seem obvious but is often overlooked. While most commercially available partitioning tools support the largest Altera and Xilinx FPGA devices, some partitioners may not support legacy FPGAs or devices from other vendors. This additional FPGA device support gives you greater flexibility to build or buy FPGA hardware prototyping systems from vendors offering the best functionality or price for your needs.

Sound Partitioning Decisions

There are many ways to partition a design to fit into your FPGA prototyping hardware, and partitioning can be done at various phases and at different levels in the design process. With the exception of the most basic designs, a tool that offers both manual and automatic partitioning functionality can play an important role in your overall success. Equally important, this tool must be able to integrate into the overall design flow.

Yes, partitioning a large ASIC design into multiple FPGAs can be challenging. Doing some upfront planning and selecting the right tool flow can make it a lot easier, assure success and achieve the desired result: a thoroughly verified ASIC and first-silicon success.

All materials on this site Copyright © 2010 TechInsights, a Division of United Business Media LLC All rights reserved.

==========

Source: http://www.pldesignline.com/222700643

Digital camera differentiates itself by adding a second display

Performing a Tear Down on the Samsung TL225 digital camera revealed how they kept the BOM under control while expanding on the features.

By Richard Nass

Embedded.com (12/10/09, 09:47:00 AM EST)

It must be difficult for digital still camera vendors to differentiate their products from those of competitors, at least in the eyes of consumers. They can compete on resolution, battery life, image quality, etc. But those features are hard for the consumer to visualize, at least while in the store making a purchase. When you can come up with something that's truly different, then you have something you can really sink your marketing teeth into.

That's what the designers at Samsung have come up with—a digital still camera with a truly differentiating feature. The TL225, which happens to be the object of my current Tear Down, is built with a secondary display. It has the usual 3.5-in. display on the back. But the key is that it has a secondary display, measuring 1.5-in., on the front side of the camera.

The Samsung TL225 digital camera offers a differentiating feature—a secondary display on the front side of the camera.

If you're a 40-something like me, you may question why there's a display on the front of the camera. But show that camera to one of your kids like I did, and they know exactly what it's for—to take a picture of yourself or you and your buddies together.

The key for Samsung was to not raise the BOM much beyond what's required for a single-display camera. And they seem to have achieved that. Taking the camera apart showed that there are two key ICs on the board in addition to the memory.

Samsung was able to keep the BOM to a minimum by keeping the number of components to a minimum.

The Coach 10 device, from Zoran, drives the main display and also handles all the data conversion for the secondary display. In essence, the IC is connected to the image sensor on one side and the LCD on the other side. In between is the interface to the flash memory. The part corrects for image stabilization, lighting, and barrel effect, both in still mode and high-definition video mode.

Zoran claims to offer more than just the silicon. They provide many of the necessary algorithms, and even a reference platform that's pretty close to everything an OEM needs to go to market with a finished product. The Coach 10 also appears in Cisco's Flip UltraHD digital camcorder, which we took apart a few months ago. Note that Zoran has since released the next two devices in the Coach family, the 11 and 12. Those parts add features like face tracking, blur correction, noise reduction, and real-time lens distortion compensation.

The second part on the board is an Igloo AGL060 device from Actel, measuring 6 mm on a side. It's a flash-based FPGA that consumes very little power, operating down to 1.2 V. This particular part contains 60,000 gates and 96 user I/Os.

The Igloo FPGA is responsible for two key functions. One is to manage the interface between the Zoran part and the memory. And the second is to handle the interface to the secondary display. Hence, it's responsible for the LCD timing control and video downscaling.

The Igloo probably could have reduced some of the processing burden in the Zoran processor, had the Samsung designers chosen to do that. While that may have allowed for a slightly less powerful main processor, it would have required a larger die for the FPGA. That's an architectural decision the system engineer has to make. But the guess here is that they likely could have reduced both the bill-of-materials (BOM) and the power consumption slightly.

One of the nice features of the Igloo is that it can operate as either the master or the slave for power control. With a feature called Flash Freeze, the device goes into a very low power mode, around 10 μW. In this state, even though there's no logic toggling, I/Os can still be receiving data. But there's no power being consumed at the I/O or core level. Because the FPGA is flash-based, the value of the registers (or the memory itself) is not lost. Externally, there's no need to switch off the power, or gate or turn off the clock.

The software for the camera was designed mostly by Samsung, with drivers coming from Zoran and Actel. That makes the integration and validation a little tricky, because at the end of the day, or the end of the design in this case, all the pieces have to fit together, especially in terms of the timing and I/O assignments. Hence, there's a lot of finger crossing when you get to the validation stage. But in the case of the TL225, it's obvious that they got everything worked out, as the camera shipped on schedule.

On such a system, overall validation can be difficult, both in developing the individual pieces of code and in making sure the timing and I/O assignments are accurate. You also have to make sure the footprint is right in terms of having everything fit properly on the board. I know that sounds obvious, but it should not be taken for granted.

Keeping the footprint as small as possible was key to the design.

The system's designers tell me that it worked right the first time, with just a little tweaking required on both the hardware and the software. This was likely because each subsystem was tested individually along the way. That increases the probability of things working correctly when they're all assembled together.

The design time for the TL225 was roughly five months from concept to completion. That's typical for a project like this one. While some of the pieces were new to this design, some IP was borrowed from previous designs, thereby fast-tracking the project somewhat.

==========

Source: http://www.embedded.com/underthehood/222001462

Monday, February 1, 2010

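The Easiest Way to Use Windows on a Mac

SCI-FOCUS

[ Issue No. 904 ] 2009-04-20

Mr. Jeong Bo-tong kept finding his eyes drawn to Apple's attractive, stylish computers. "If I'm going to use a computer anyway, it might as well be one that looks that good," he thought. But there was a reason he could not simply go out and buy an Apple notebook. To connect to his company's intranet system or use his existing Internet banking, he had to use a Windows PC. As long as he wanted to handle office work from home or do his banking, he had no choice but to keep using a Windows PC.

Not long ago, however, Mr. Jeong went ahead and bought the Apple notebook he had been eyeing. He had recently learned that, thanks to virtualization software, Windows applications can also run on Apple's OS X (pronounced "OS Ten").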

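Virtualization is a broad term that refers to the abstraction of computer resources. Wikipedia, the online encyclopedia, defines virtualization as "a technique for hiding the physical characteristics of computing resources from the way in which other systems, applications, or end users interact with those resources." In other words, it means making multiple resources (servers, operating systems, applications, storage devices) appear as a single resource, or making a single physical resource appear as multiple logical resources.

That may sound a little puzzling, but virtualization in practice is simple. Installing several operating systems on one PC and using them at the same time, as Mr. Jeong did, is virtualization. It is also called "platform virtualization."

Virtualization has been in use since the mainframe era of the 1970s. Emulation is one example of virtualization. Virtualization has boomed recently as x86 CPUs from Intel and AMD have begun to support it in earnest. The concept of platform virtualization has been extended to the virtualization of specific system resources such as data storage and network resources.

Now Mr. Jeong uses OS X on his Apple notebook when organizing photos, listening to music, or watching movies, and opens a Microsoft Windows window through the virtualization software when he needs to connect to the company intranet or do Internet banking. When he needs a Linux program, he can bring up Linux without any trouble. His Apple notebook is a single computer, but he can use it as if it were several PCs.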

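Virtualization is bringing major change to the computing environment. The US business magazine BusinessWeek, Forrester Research, and Gartner Group have for several years forecast virtualization as the most important technology in the PC field. The market for virtualization software has in fact been growing by more than 60% a year. Virtualization is spreading this quickly because the cost savings are large. With every company working to cut expenses in the global recession, virtualization can only attract more attention.

Today IT systems are indispensable to corporate work. Approvals, drafting of proposals, internal information sharing, purchasing and bidding, asset management, finance, web management: most work is carried out through IT systems. As a result, most companies use more business PCs and notebook computers than they have employees. Needless to say, maintaining these computers costs a considerable amount. Security patches for the various operating systems, software upgrades, and the management of antivirus and security programs in particular require large amounts of money and staff. That is why a growing number of companies are adopting virtualization to manage their business PCs efficiently.

"Client virtualization computing," which has drawn attention recently, is also a form of virtualization. A "client" is any of the various personal IT devices that connect to a central server. PCs, notebooks, and PDAs, and also iPods and mobile phones, can be clients. Client virtualization is a technology that raises the utilization of existing idle resources through virtualization rather than through the purchase of additional devices. Because existing equipment is used as it is, costs are reduced, and because the computers can be managed individually from the data center, work efficiency improves as well.

Adopting this kind of virtualization brings several benefits. Replacing the PCs on employees' desks with thin clients saves office space. Because servers or thin blade PCs located in the data center take the place of employees' PCs, the whole system is managed in one place and equipment purchase and installation costs are reduced. Employees can reach their data from anywhere with an Internet connection, which improves work efficiency, and the company can more easily handle management tasks such as responding to viruses and preventing leaks of confidential documents.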

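So how much could our company save through virtualization? As interest in virtualization has grown, websites have appeared that calculate the costs that adopting it would save (equipment purchase costs, electricity bills, space rental, management costs, and so on). According to such calculators, virtualization can cut the maintenance cost of all IT equipment by as much as 50%.

Virtualization looks set to greatly improve the inefficient arrangement in which only one operating system is installed on each computer. Thanks to virtualization, it should also become rarer for one operating system to monopolize the market, because people who used only Microsoft Windows, like Mr. Jeong, can now use other operating systems at the same time. Virtualization not only cuts companies' costs and makes computers more convenient for individuals, it also helps put the market in proper order, which makes it something of a "dutiful child" among technologies.

By Dr. Lee Sik (Principal Researcher, Korea Institute of Science and Technology Information)

Copyright(c)2006 KISTI All rights reserved. All copyrights belong to the Korea Institute of Science and Technology Information.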

==========
Source: http://scent.ndsl.kr/View.do?seq=4115&meid=1_2&class=100&gotoPage=4 &ordering=ISSUE&type=1&menu_id=104034&SearchText=&SearchGubun=SCENT&SearchYear1=2003 &SearchYear2=2010&onlyBody=FALSE