CPU design focuses on the following areas:
- Datapaths (such as ALUs and pipelines)
- Control unit: logic which controls the datapaths
- Memory components such as register files, caches
- Clock circuitry such as clock drivers, PLLs, clock distribution networks
- Pad transceiver circuitry
- Logic gate cell library which is used to implement the logic
CPUs designed for high-performance markets may require custom designs for each of the above items to meet their frequency (clock rate), power dissipation, and chip area goals.
CPUs designed for lower-performance markets can reduce this implementation burden by:
- Acquiring some of these items by purchasing them as intellectual property
- Using control logic implementation techniques (logic synthesis using CAD tools) to implement the other components: datapaths, register files, clocks
Common logic styles used in CPU design include:
- Unstructured random logic
- Finite-state machines
- Microprogramming (common from 1965 to 1985)
- Programmable logic array (common in the 1980s, no longer common)
Device types used to implement the logic include:
- Transistor-transistor logic small-scale integration logic chips - no longer used for CPUs
- Programmable Array Logic (PAL) and programmable logic devices - no longer used for CPUs
- Emitter-coupled logic (ECL) gate arrays - no longer common
- CMOS gate arrays - no longer used for CPUs
- CMOS ASICs - what is commonly used today[when?]; they are so common that the term ASIC is not used for CPUs
- Field-programmable gate arrays - common for soft microprocessors, and more or less required for reconfigurable computing
A CPU design project generally involves these major tasks:
- Programmer-visible instruction set architecture, which can be implemented by a variety of microarchitectures
- Architectural study and performance modeling in ANSI C/C++ or SystemC
- High-level synthesis (HLS) or register transfer level (RTL, e.g. logic) implementation
- RTL verification
- Circuit design of speed critical components (caches, registers, ALUs)
- Logic synthesis or logic-gate-level design
- Timing analysis to confirm that all logic and circuits will run at the specified operating frequency
- Physical design including floorplanning, place and route of logic gates
- Checking that RTL, gate-level, transistor-level and physical-level representations are equivalent
- Checks for signal integrity, chip manufacturability
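The timing-analysis task in the list above can be illustrated with a minimal sketch. It assumes a hypothetical combinational netlist modeled as a directed acyclic graph with one fixed delay per gate. The gate names, the delay values, and the helpers `critical_path_delay` and `meets_timing` are invented for illustration, and the model ignores setup time, clock skew, and wire delay, all of which a real timing tool accounts for.

```python
from functools import lru_cache

# Hypothetical gate delays in nanoseconds (illustrative values only).
GATE_DELAY_NS = {
    "in_a": 0.0, "in_b": 0.0,
    "and1": 0.12, "xor1": 0.18, "or1": 0.10,
    "out": 0.0,
}

# Hypothetical netlist: each gate maps to the gates that drive its inputs.
FANIN = {
    "in_a": [],
    "in_b": [],
    "and1": ["in_a", "in_b"],
    "xor1": ["in_a", "and1"],
    "or1": ["and1", "xor1"],
    "out": ["or1"],
}


def critical_path_delay(fanin, delay, endpoint):
    """Return the longest cumulative delay from any input to `endpoint`."""

    @lru_cache(maxsize=None)
    def arrival(gate):
        # A gate's output settles after its slowest input plus its own delay.
        latest_input = max((arrival(src) for src in fanin[gate]), default=0.0)
        return latest_input + delay[gate]

    return arrival(endpoint)


def meets_timing(path_delay_ns, frequency_ghz):
    """True if the path fits within one clock period at the given frequency."""
    period_ns = 1.0 / frequency_ghz
    return path_delay_ns <= period_ns
```

In this toy netlist the slowest path runs through `and1`, `xor1`, and `or1`, so the design would pass at a clock whose period exceeds that path delay and fail at a faster one.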
Re-designing a CPU core to occupy a smaller die area helps achieve several of the following goals:
- Shrinking everything (a "photomask shrink"), resulting in the same number of transistors on a smaller die, improves performance (smaller transistors switch faster), reduces power (smaller wires have less parasitic capacitance) and reduces cost (more CPUs fit on the same wafer of silicon).
- Releasing a CPU on the same size die, but with a smaller CPU core, keeps the cost about the same but allows higher levels of integration within one VLSI chip (additional cache, multiple CPUs, or other components), improving performance and reducing overall system cost.
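The cost argument above (more CPUs fit on the same wafer) can be made concrete with a widely used approximation for gross dies per wafer: wafer area divided by die area, minus a correction for partial dies lost at the wafer edge. The sketch below is only an estimate under stated assumptions: the function name is hypothetical, and it ignores yield, scribe lines, and edge exclusion.

```python
import math


def gross_dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    """Estimate whole dies per wafer with the common edge-loss approximation.

    dies ~= pi * (d / 2)^2 / S  -  pi * d / sqrt(2 * S)
    where d is the wafer diameter and S is the die area.
    """
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


# Hypothetical example: halving the die area on a 300 mm wafer.
before = gross_dies_per_wafer(300, 100.0)
after = gross_dies_per_wafer(300, 50.0)
```

Under this approximation, halving the die area slightly more than doubles the die count, because proportionally fewer dies are lost at the wafer edge.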
As with most complex electronic designs, the logic verification effort (proving that the design does not have bugs) now dominates the project schedule of a CPU.
Performance analysis and benchmarking
Because there are too many programs to test a CPU's speed on all of them, benchmarks were developed. The most famous benchmarks are the SPECint and SPECfp benchmarks developed by the Standard Performance Evaluation Corporation and the ConsumerMark benchmark developed by the Embedded Microprocessor Benchmark Consortium (EEMBC).
Some of the most important measurements include:
- Instructions per second - Most consumers pick a computer architecture (normally the Intel IA-32 architecture) to be able to run a large base of pre-existing, pre-compiled software. Being relatively uninformed about computer benchmarks, some of them pick a particular CPU based on operating frequency (see Megahertz Myth).
- FLOPS - The number of floating point operations per second is often important in selecting computers for scientific computations.
- Performance per watt - System designers building parallel computers, such as Google, pick CPUs based on their speed per watt of power, because the cost of powering the CPU outweighs the cost of the CPU itself. 
- Some system designers building parallel computers pick CPUs based on the speed per dollar.
- System designers building real-time computing systems want to guarantee worst-case response. That is easier to do when the CPU has low interrupt latency and when it has deterministic response. (DSP)
- Computer programmers who program directly in assembly language want a CPU to support a full featured instruction set.
- Low power - For systems with limited power sources (e.g. solar, batteries, human power).
- Small size or low weight - for portable embedded systems, systems for spacecraft.
- Environmental impact - Minimizing the environmental impact of computers during manufacturing and recycling as well as during use, by reducing waste and hazardous materials (see Green computing).
Some of these measures conflict. In particular, many design techniques that make a CPU run faster make the "performance per watt", "performance per dollar", and "deterministic response" much worse, and vice versa.
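The conflict between these measures can be sketched with a small comparison. The three CPUs below, their benchmark scores, power figures, and prices are entirely hypothetical, chosen only to show that the "best" part changes with the metric used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    name: str
    benchmark_score: float  # hypothetical score, higher is faster
    power_watts: float      # hypothetical power draw under load
    price_usd: float        # hypothetical unit price

    @property
    def perf_per_watt(self):
        return self.benchmark_score / self.power_watts

    @property
    def perf_per_dollar(self):
        return self.benchmark_score / self.price_usd


# Illustrative candidates only; none of these are real products.
CANDIDATES = [
    Cpu("fast", benchmark_score=100.0, power_watts=125.0, price_usd=500.0),
    Cpu("efficient", benchmark_score=40.0, power_watts=10.0, price_usd=160.0),
    Cpu("cheap", benchmark_score=30.0, power_watts=15.0, price_usd=60.0),
]


def best_by(cpus, metric):
    """Return the name of the CPU that maximizes the given metric."""
    return max(cpus, key=metric).name
```

Calling `best_by(CANDIDATES, lambda c: c.benchmark_score)`, then the same with `c.perf_per_watt` and `c.perf_per_dollar`, selects a different candidate each time.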
Developing new, high-end CPUs is a very costly proposition. Both the logical complexity (needing very large logic design and logic verification teams and simulation farms with perhaps thousands of computers) and the high operating frequencies (needing large circuit design teams and access to the state-of-the-art fabrication process) account for the high cost of design for this type of chip. The design cost of a high-end CPU will be on the order of US $100 million. Since the design of such high-end chips nominally takes about five years to complete, to stay competitive a company has to fund at least two of these large design teams to release products at the rate of 2.5 years per product generation.
As an example, the typical loaded cost for one computer engineer is often quoted as US$250,000 per year. This includes salary, benefits, CAD tools, computers, office space rent, etc. Assume that 100 engineers are needed to design a CPU and that the project takes 4 years:
Total cost = $250,000 per engineer-year × 100 engineers × 4 years = $100,000,000.
The above amount is just an example. The design teams for modern day general purpose CPUs have several hundred team members.
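The arithmetic above, together with the earlier point about overlapping design teams, can be written as a short sketch. The function names are hypothetical, and the inputs are the article's illustrative figures, not measured data.

```python
import math


def design_cost_usd(loaded_cost_per_engineer_year, engineers, years):
    """Total design cost as engineer-years times loaded cost per engineer-year."""
    return loaded_cost_per_engineer_year * engineers * years


def teams_needed(design_years, years_between_releases):
    """Number of overlapping teams needed to sustain a release cadence."""
    return math.ceil(design_years / years_between_releases)


# Figures taken from the example in the text.
example_cost = design_cost_usd(250_000, engineers=100, years=4)
example_teams = teams_needed(design_years=5, years_between_releases=2.5)
```

With a five-year design cycle and a new product every 2.5 years, `teams_needed` gives the two teams mentioned earlier, so the funding requirement is roughly twice the single-project figure.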
There are several different markets in which CPUs are used. Since each of these markets differs in its requirements for CPUs, the devices designed for one market are in most cases inappropriate for the other markets.
General-purpose computing
The vast majority of revenues generated from CPU sales is for general-purpose computing[citation needed], that is, desktop, laptop, and server computers commonly used in businesses and homes. In this market, the Intel IA-32 architecture dominates, with its rivals PowerPC and SPARC maintaining much smaller customer bases. Yearly, hundreds of millions of IA-32 architecture CPUs are used by this market. A growing percentage of these processors are for mobile implementations such as netbooks and laptops.
Since these devices are used to run countless different types of programs, these CPU designs are not specifically targeted at one type of application or one function. The demands of being able to run a wide range of programs efficiently have made these CPU designs among the more technically advanced, with the disadvantages of being relatively costly and having high power consumption.
High-end processor market
In 1984, most high-performance CPUs required four to five years to develop.
Scientific computing is a much smaller niche market (in revenue and units shipped). It is used in government research labs and universities. Before 1990, CPU design was often done for this market, but mass market CPUs organized into large clusters have proven to be more affordable. The main remaining area of active hardware design and research for scientific computing is for high-speed data transmission systems to connect mass market CPUs.
As measured by units shipped, most CPUs are embedded in other machinery, such as telephones, clocks, appliances, vehicles, and infrastructure. Embedded processors sell in volumes of many billions of units per year, though mostly at much lower price points than general-purpose processors.
These single-function devices differ from the more familiar general-purpose CPUs in several ways:
- Low cost is of utmost importance.
- It is important to maintain a low power dissipation as embedded devices often have a limited battery life and it is often impractical to include cooling fans.
- To give lower system cost, peripherals are integrated with the processor on the same silicon chip.
- Keeping peripherals on-chip also reduces power consumption as external GPIO ports typically require buffering so that they can source or sink the relatively high current loads that are required to maintain a strong signal outside of the chip.
- Many embedded applications have a limited amount of physical space for circuitry; keeping peripherals on-chip will reduce the space required for the circuit board.
- The program and data memories are often integrated on the same chip. When the only allowed program memory is ROM, the device is known as a microcontroller.
- For many embedded applications, interrupt latency will be more critical than in some general-purpose processors.
Embedded processor market
The embedded CPU family with the largest number of total units shipped is the 8051, averaging nearly a billion units per year. The 8051 is widely used because it is very inexpensive. The design time is now roughly zero, because it is widely available as commercial intellectual property. It is now often embedded as a small part of a larger system on a chip. The silicon cost of an 8051 is now as low as US$0.001, because some implementations use as few as 2,200 logic gates and take 0.0127 square millimeters of silicon.
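The two figures quoted above imply a silicon cost per unit area, which can be checked with a short calculation. This is a back-of-the-envelope sketch: the function names are hypothetical, and, like the quoted figure, the result leaves out packaging, pins, and test costs.

```python
def implied_cost_per_mm2(core_cost_usd, core_area_mm2):
    """Silicon cost per square millimeter implied by a core's cost and area."""
    return core_cost_usd / core_area_mm2


def silicon_cost_usd(area_mm2, cost_per_mm2):
    """Silicon cost of a block of the given area at a given cost per mm^2."""
    return area_mm2 * cost_per_mm2


# Figures from the text: US$0.001 for a 0.0127 mm^2 8051 implementation.
cost_per_mm2 = implied_cost_per_mm2(0.001, 0.0127)
```

The result is a little under 8 US cents per square millimeter, which is the rate at which a core this small costs about a tenth of a cent when embedded in a larger system on a chip.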
As of 2009, more CPUs are produced using the ARM architecture instruction set than any other 32-bit instruction set. The ARM architecture and the first ARM chip were designed in about one and a half years and 5 human years of work time.
CPU design for research and education
The 32-bit Berkeley RISC I and RISC II architecture and the first chips were mostly designed by a series of students as part of a four-quarter sequence of graduate courses. This design became the basis of the commercial SPARC processor design.
For about a decade, every student taking the 6.004 class at MIT was part of a team; each team had one semester to design and build a simple 8-bit CPU out of 7400 series integrated circuits. One team of 4 students designed and built a simple 32-bit CPU during that semester.
Some undergraduate courses require a team of 2 to 5 students to design, implement, and test a simple CPU in an FPGA in a single 15-week semester.
Soft microprocessor cores
For embedded systems, the highest performance levels are often not needed or desired due to the power consumption requirements. This allows for the use of processors which can be totally implemented by logic synthesis techniques. These synthesized processors can be implemented in a much shorter amount of time, giving quicker time-to-market.
See also
- Central processing unit
- History of general purpose CPUs
- Moore's law
- Amdahl's law
- System on a chip
- RISC (Reduced instruction set computer)
- CISC (Complex instruction set computer)
- MISC (Minimal instruction set computer)
- Electronic design automation
- High-level synthesis