Here is "PLDWorld.{블로그}"...: Xilinx revisits the embedded-CPU FPGA

Wednesday, April 28, 2010

Nearly a decade ago Xilinx and Altera set a new direction for the FPGA industry, each announcing a high-end FPGA sitting beside a powerful CPU on one die. Enticed by what had been explosive growth in a networking industry that was in fact using MPUs and high-end FPGAs side by side on its boards, the programmable-logic leaders poured development and marketing dollars into their new flagship ICs, Altera Excalibur and Xilinx Virtex-II Pro.

If this story doesn't sound familiar, it's because the two chips were both doomed to vanish. Within about a year both chips were no longer actively marketed, though you could still buy them. Quiet settled over the scene of the revolution, dust gathered on the engineering notebooks, and both companies silently pledged not to try that again.

Exactly what went wrong is a difficult question. There is always enough blame to go around when an entire product category fails. Certainly the issue was not silicon execution: both chips were heavily used in the academic community as platforms for research that became much of the foundation of today's heterogeneous multicore embedded computing.

Rather, the issues were more practical. By the time they were shipping, Excalibur and Virtex-II Pro were comparatively expensive ways to buy what had become a mature microprocessor. So the significant added cost of the FPGA-based parts was hard to justify for production. There was also the problem of configuration. As any product manager can attest, anything you integrate into a chip is the wrong choice for the next customer you talk to. You have the wrong CPU, or the wrong memory architecture, or not the right peripherals, or not enough or too much FPGA fabric. Finally, and perhaps the most serious problem for both chips, the interface between the CPU and FPGA sides of the die is always problematic. An interface powerful and flexible enough for experienced SoC architects is incomprehensible to traditional FPGA users.

All this notwithstanding, yesterday ARM and Xilinx announced another cut at the challenge: the Extensible Processing Platform (or EPP, if you will allow). With perhaps a nervous glance over the shoulder to check for the spectre of Virtex-II Pro, Xilinx is positioning this product not as an FPGA with an on-chip CPU, but as a software execution platform that happens to facilitate configurable hardware accelerators and peripherals. The difference may sound like mere wording, but it is more than marketing-program deep.

The EPP is architected somewhat differently from the earlier chips. Like them, it is divided into a processor portion and an FPGA portion. But the EPP's processor side is nearly self-contained, comprising a pair of ARM Cortex-A9 MPCore CPU cores, along with the NEON media engine, the debug core, the recently released AXI4 interconnect IP, caches, DRAM controller, and typical peripherals. Xilinx senior vice president of marketing Vincent Ratford pointed out that the CPU side of the chip is sufficiently autonomous that it can boot Linux before the programmable fabric is even configured. The FPGA side will apparently look a lot like a moderate-sized Virtex-6, with fabric, block RAM, probably DSP blocks, and, in some versions, fast SerDes.

The interconnect between the two sides is a more interesting subject. Ratford said that about 2500 signals will cross the boundary between the CPU and FPGA regions. That apparently includes both the high-bandwidth main bus and the peripheral bus of the AXI network. It is not clear just how the multi-layer nature of AXI will be propagated into the FPGA fabric. ARM's multicore coherency bus will also extend into the fabric, according to ARM Physical IP Division executive vice president and general manager Simon Segars. So it should be possible for sophisticated users to implement coherent caches and local memories for accelerators in the FPGA block RAM.

The chip will use TSMC's 28HPL process, and Xilinx plans to sample at least one version sometime in 2011—a pretty big window. Ratford said there would be several versions of the die with different processor subsystems.

The user design flow will be quite different from the traditional FPGA flow. Ratford said, "This product targets the software developers." The concept is that developers, presumably starting with a reference design, would use ARM's RealView Development System (RVDS) to bring up an application in C/C++. Then they would profile the code execution, identify hot spots and critical sequences, and call in the hardware team with behavioral synthesis tools to massage the underperforming C into RTL. From there, the RTL would go into Xilinx's ISE 12 tool chain, eventually becoming a configuration file for the FPGA side of the chip. There are plans to link RVDS and ISE at some critical points to allow debug in both environments at once. Xilinx is also exploring MATLAB and LabVIEW as design-origination tools.
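To make the profile-then-offload step concrete, here is a minimal sketch of how a team might rank hot spots before calling in the hardware engineers. It applies Amdahl's law to a profile. The function names, runtime shares, and the 20x accelerator factor are all hypothetical illustrations, not figures from Xilinx or ARM.

```python
def overall_speedup(hot_fraction: float, accel_factor: float) -> float:
    """Amdahl's law: whole-program speedup when only part of the runtime is accelerated.

    hot_fraction -- share of total runtime spent in the code being offloaded (0..1)
    accel_factor -- how many times faster that code runs in hardware (assumed)
    """
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    if accel_factor <= 0.0:
        raise ValueError("accel_factor must be positive")
    return 1.0 / ((1.0 - hot_fraction) + hot_fraction / accel_factor)


# Hypothetical profiler output: share of total runtime per function.
profile = {"fir_filter": 0.60, "parse_packet": 0.25, "log_stats": 0.15}

# Assumed speedup of a fabric accelerator over the software version.
ACCEL_FACTOR = 20.0

# Rank offload candidates by the whole-program gain each would deliver.
ranked = sorted(
    profile,
    key=lambda name: overall_speedup(profile[name], ACCEL_FACTOR),
    reverse=True,
)

for name in ranked:
    gain = overall_speedup(profile[name], ACCEL_FACTOR)
    print(f"{name}: {gain:.2f}x overall if moved to the fabric")
```

The estimate is deliberately optimistic: it ignores the cost of moving data across the CPU-FPGA boundary, and no accelerator, however fast, can push the overall gain past 1 / (1 - hot_fraction).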

So are there enough fundamental differences to predict a better fate for the EPP than overtook the Virtex-II Pro? Some things are indeed profoundly different this time. First, you can put vastly more hardware into a large 28nm die than you could into a big chip ten years ago. That means more performance, a please-almost-everyone selection of peripherals at a decent cost point, room for more capable accelerators, and—desperately important—much more on-chip memory. Second, the ARM architecture is far more ubiquitous today than the PowerPC was then. So even if the big networking vendors are once again unimpressed, many other applications are still available. These two facts should substantially reduce barriers to market acceptance of the new architecture.

Third, EPP will probably be one of the first implementations of Cortex-A9 in 28nm to be available to the general market, not a late-coming and expensive alternative to a two-chip approach. Even though the A9 has been announced for about a year now, many users may find the EPP a very accessible way to get at one. If users see value in the FPGA portion of the die as well, the EPP could look like a good deal. And finally, the EPP is addressed to a very different market than Virtex-II Pro. The earlier chip was aimed at FPGA experts. EPP is addressed to software-dominated design teams in which hardware engineers play a supporting role.

Will it work? There remain two major questions. First, can the kind of software-first methodology Xilinx envisions successfully produce a working SoC with today's tools, or will the design require early engagement by FPGA experts, careful system modeling and parallel hardware and software development? If the latter is the case, much of the advantage of the EPP is lost. Second, can Xilinx hide from designers the complexity of the interface between the CPU and FPGA sides of the die, without obscuring the power of the architecture? Neither software developers nor traditional FPGA users are going to cope successfully with the interface in all its riches. Yet the advantage of the EPP over a commodity microprocessor used with an inexpensive FPGA rests in users' ability to exploit that interface. Only time can answer these two questions.

==========

Sunday, May 2, 2010
