New Year's Wishes


  • May you get a clean bill of health from your dentist, your cardiologist, your gastro-enterologist, your urologist, your proctologist, your podiatrist, your psychiatrist, your plumber and the I.R.S/Income Tax dept.
  • May your hair, your teeth, your face-lift, your abs and your stocks not fall; and may your blood pressure, your triglycerides, your cholesterol, your white blood count and your mortgage interest not rise.
  • May New Year's Eve find you seated around the table, together with your beloved family and cherished friends.
  • May you find the food better, the environment quieter, the cost much cheaper, and the pleasure much more fulfilling than anything else you might ordinarily do that night.
  • May what you see in the mirror delight you, and what others see in you delight them.
  • May someone love you enough to forgive your faults, be blind to your blemishes, and tell the world about your virtues.
  • May the telemarketers wait to make their sales calls until you finish dinner, may the commercials on TV not be louder than the program you have been watching, and may your check book and your budget balance - and include generous amounts for charity.
  • May you remember to say "I love you" at least once a day to your spouse, your child, your parent, your siblings; but not to your secretary, your nurse, your masseuse, your hairdresser or your tennis instructor.
  • And may we live in a world at peace and with the awareness of God's love in every sunset, every flower's unfolding petals, every baby's smile, every lover's kiss, and every wonderful, astonishing, miraculous beat of our heart.

Have a wonderful New Year Ahead :-)

Future Trends!!


How long do you think DVDs have been around? 20 years? 10 years? Actually, they have only been around for about seven years, but it seems like they have been around much longer. Many of us can remember life before DVDs, when we used VCDs; though VCDs still exist, their popularity has declined drastically. That can be attributed to how rapidly we become acclimated to some innovations in electronics technology. I believe there are other electronics technologies, either just getting ready to take off, not widely available yet, or just around the corner, that are going to be adopted just as quickly in the near future.

While we are in the age of ultra-high-speed broadband internet services, several technologies just around the corner are going to make connections much faster than they are today. Typical download speeds for broadband range from 1.5 to 10 megabits per second (Mbps) today. Within the next year, speeds of 15-20 Mbps will be available to the average consumer. Then, shortly thereafter, speeds of up to 25, 50, 75, and even 100 Mbps will be available in some places. In the not-so-distant future, speeds of 25-100 Mbps will be quite common. "Fast TCP", which is currently being tested, has the potential to turbo-charge all forms of currently available broadband internet connections without requiring any infrastructure upgrades. It makes better use of the way in which data is broken down and put back together within traditional internet protocols.

All the major phone companies are currently in the process of replacing their copper wires with high-capacity fiber optic lines. Fiber optic lines will greatly increase the amount of bandwidth that can be delivered. Fiber optics will allow phone companies to deliver video, either via a cable TV-type platform or a TV over Internet Protocol (TVIP) platform (see my October 7 column), as well as faster DSL speeds. The current breakthrough is "triple play", in which TV, internet, and phone service are delivered over the same medium; you can get more information on this by searching around. Eventually, the current internet as we know it is expected to be scrapped and completely replaced with a whole new internet called "Internet 2." This new internet is expected to provide speeds of up to 6,000 times faster than current broadband connections!

Another technology item that you've probably heard a lot about recently is digital television. Digital TV transmits the picture as digital data rather than as an analog waveform, so it can carry much more information in the same channel. It also has a picture that never gets "snowy" or "fuzzy": if the signal is not strong enough, you get no picture at all, rather than the fuzzy picture you sometimes get with analog. In order to receive digital signals over the airwaves, you must have a digital TV set (one with a digital tuner inside) or an analog TV with a set-top converter. Cable and satellite TV also use digital formats, but unlike broadcast signals, their non-High Definition digital signals are automatically converted to a format an analog TV can process, so a digital TV or converter is not needed. High Definition Television formats, even on cable or satellite, require a digital TV or a converter (more on High Definition later).

All broadcasters are now doing some broadcasts on their digital channels in addition to their normal broadcasts on their analog channels, but they were originally supposed to convert completely from analog signals to digital signals by the end of 2006. However, there is an exception that allows them to wait until 85% of the television sets in their market are digital, which could take 10 years or more. Congress and the FCC are now looking at imposing a hard deadline on all broadcasters to convert to digital signals by 2009. Once they all convert to digital signals, their analog channels will be taken back by the FCC and used for other purposes, such as emergency signals.

High Definition Television (HDTV) is one possible use of digital signals. HDTV uses the entire digital bandwidth and is the crystal clear format you've probably seen on TVs in electronics stores. It has no visible lines on the screen. Someone once described it as being like "watching a movie in the theater." Keep in mind that all HDTV is digital, but not all digital is HDTV. Along those same lines, not all digital TVs are HDTVs. Since digital TVs are very expensive and those with HDTV capability are even more expensive, consumers really need to keep this in mind.
The other possible use of digital signals is channel compression, often referred to as "multicasting." Non-HDTV programming does not utilize the entire width of a digital signal. Therefore, it is possible to compress two or more channels of programming into one digital signal. Satellite and cable operators do this all the time with their non-HDTV digital channels, but this process is transparent so many people don't realize it. Many broadcasters plan to use their digital signals this way during times when they are not being used for HDTV programming. For example, some plan to air all news and all weather channels in addition to their regular channels of programming.

TV recording and playback technology is changing as well. DVD recorders, which debuted about four years ago, have now become affordable to the average family. A couple of years ago, they were priced above $1000, but now you can get them for around $25, in many cases. The main sticking point with DVD recorders is that not all of them will record and play all three of the competing formats: DVD-RAM, DVD-RW, and DVD+RW. They will have difficulty gaining wide acceptance from the public until one format is settled on or all recorders can record and play all three formats. We also now have HD DVD and Blu-ray, which boast enormous storage capacities.

On the other hand, digital video recorders (DVRs) and personal video recorders (PVRs), two names for what is really the same thing, seem to be gaining quickly in popularity. DVRs/PVRs utilize a hard drive to record programs, without the need for discs or tapes. DVRs/PVRs with larger hard drives are becoming available and less expensive all the time. These devices can record one show while you are watching another. They can record more than one show at a time. They allow you to watch the part of a show that has already been recorded while the remainder of that show is still being recorded. They allow for easy scanning, searching, and skipping through recorded programs and even allow you to skip commercials with one touch of a button. They allow you to pause live programs while you answer the door or go to the restroom and then pick up where you left off when you get back. With these devices, recording can be automatic, i.e., you can program them to automatically record every episode of your favorite shows, no matter when they air. You can also have them automatically find and record programs that match your interests. In addition, video can be automatically downloaded to the device via a phone connection. TiVo, the leading brand in the industry, has announced that it will be teaming up with Netflix next year to allow downloading of movies on demand via a broadband internet connection (see my October 7 column for more details).

DVRs/PVRs are becoming so popular that cable and satellite TV providers have begun including them as add-ons to their receivers, either at no extra cost or for a small additional monthly fee. About the only shortcoming of DVRs/PVRs is the fact that they can't play pre-recorded DVDs or tapes, so you would still need your DVD player or VCR if you rent or purchase movies. However, hybrid devices which combine DVRs/PVRs with a DVD player/recorder and/or VCR are now hitting the market. Those devices would not only get rid of that problem but would also give you the option of permanently transferring a recorded show/movie from a hard drive to a recordable DVD.

Flat screen and flat panel TV technology is also starting to boom. Many people are confused about the difference between flat screen TVs and flat panel TVs. Flat screen TVs use the old cathode ray tube (CRT) technology for their picture tubes and are therefore bulky like traditional TV sets. However, they are different from traditional TV sets in that they have a flat screen. They deliver a picture that doesn't have as much glare as traditional, more round screens. Also, the picture will look the same to everyone in the room, no matter where they are sitting. The picture on a traditional screen looks distorted when viewing it from an angle.
Flat panel TVs, on the other hand, utilize either liquid crystal display (LCD) or plasma technology instead of the old CRT technology and are generally just a few inches thick. Many of them can be hung on a wall. In fact, flat panel TVs that are flatter than a credit card will be coming soon! What's the difference between LCD and plasma? LCD is generally used for flat panel TVs with a display of less than 30 inches and usually has a brighter picture and better contrast than plasma. LCD is used for flat panel computer monitors as well. Plasma is generally used for flat panel TVs with a display of more than 30 inches and has a better color range than LCD. Plasma is becoming more common as TVs get bigger and flatter.

Although I'm not so sure about this one, I will include "entertainment PCs" because of their tremendous potential to revolutionize home entertainment. The concept of "entertainment PCs" is being hailed right now by both Microsoft and Intel. In fact, Microsoft has developed a special operating system for them. They could be used as the hub for all home entertainment and could enhance a family's experience of television, radio/music, and internet and actually help to combine all of these into one. They could be used to download content from the internet and play it on a TV. They could provide such sophisticated TV recording interfaces that VCRs, DVDs, and DVRs/PVRs could all eventually become obsolete. In addition, they could be a better source for photograph and home video editing and processing than regular PCs. With that being said, I'm not so sure that people will be willing to accept PCs as a source of home entertainment. Bill Gates begs to differ and is willing to put his money where his mouth is.

Obviously, not all of the cutting edge electronics technologies mentioned above will meet with great success. Some of them might actually go the way of Betamax, digital audio tape (DAT), and DIVX. However, many of them are sure to catch fire and become such an integral part of our everyday lives that we'll wonder how we ever got along without them. Which ones will they be? Only time will tell.

Gate Level Simulation, Part - II


Gate level simulation is used in the late design phase to increase the level of confidence in a design implementation and to complement verification results created by static methods (formal verification and static timing analysis). In addition to the disadvantage of medium to long run times for simulating comprehensive vector sets on large designs, the coverage of potential functional and timing problems depends highly on the quality of the input stimulus and cannot be guaranteed in a practical way. In some cases, however, a gate level simulation can help to verify dynamic circuit behavior that cannot be accurately verified with static methods, for example the start-up and reset phase of a chip. To reduce the overall cycle time, only a minimal set of vectors should be simulated using the most accurate timing model available (parasitics extracted from the post-layout database).

Unit Delay Simulation:
The netlist after synthesis, but before routing, does not yet contain the clock tree. It does not make sense to use SDF back-annotation at this step, but gate-level simulation may still be used to verify the reset circuit and the scan chain, or to gather data for power estimation. If no back-annotation is used, the simulator should use libraries in which the specify blocks containing the timing arcs are disabled and distributed delays are used instead. A safe choice is a default delay of 10 ps for a storage element, 1 ps for a combinational gate, and 0 for a clock gating cell; simulator memory footprint and performance also improve when the specify block is disabled.
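For illustration, one way a library model might support this, assuming a hypothetical `UNIT_DELAY compile-time macro and made-up delay values, is to guard the specify block and fall back to a distributed delay:

  `timescale 1ps/1ps
  // Hypothetical library flop model: the specify block carrying the timing
  // arcs is compiled out for unit-delay runs and a fixed delay is used instead.
  module dff_x1 (input wire clk, input wire d, output reg q);
  `ifdef UNIT_DELAY
    always @(posedge clk) q <= #10 d;        // storage element: 10 ps unit delay
  `else
    always @(posedge clk) q <= d;
    specify
      (posedge clk => (q : d)) = (50, 60);   // placeholder rise/fall path delays
      $setuphold(posedge clk, d, 20, 10);    // placeholder setup/hold checks
    endspecify
  `endif
  endmodule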

Full Timing Simulation (With SDF):
Simulation is run taking full timing delays from the SDF file, which is used to back-annotate values for propagation delays and timing checks onto the Verilog gate-level netlist.
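In a typical testbench this back-annotation is done with the standard $sdf_annotate system task; a minimal sketch follows (the file, instance, and log names are placeholders):

  module tb;
    // clock generation and stimulus omitted for brevity
    chip_top dut ();   // gate-level netlist instance

    initial begin
      // annotate post-layout delays and timing checks onto "dut";
      // "MAXIMUM" selects the max corner from the SDF min:typ:max triplets
      $sdf_annotate("chip_top_postroute.sdf", dut, , "sdf_annotate.log", "MAXIMUM");
    end
  endmodule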

Comments are greatly appreciated.

Glossary of EDA Terms


Verilog rules that can save your breath !


This article contains some of my thoughts about how an engineer should write Verilog code for synthesis, general rules that will save you headaches if you follow them, and how a Verilog file should be laid out.
Rules:
  • If you don't know what hardware the code you just wrote describes, neither will the synthesizer.
  • Remember that Verilog is a Hardware Description Language (HDL), and as such it describes hardware, not magical circuits that you can never actually build.
  • You should be able to draw a schematic for everything that you can write Verilog for.
  • Be sure to know which parts of your circuit are combinational and which are sequential elements. If you do not know, or the code is written in a way that makes this too hard to figure out, the synthesizer will probably not be able to figure it out either. I recommend keeping combinational logic very separate from sequential logic. This prevents errors later. It also prevents level-sensitive latches from being synthesized where you meant to have flip-flops. I also recommend having a naming convention so that you can tell what is a state-holding element at all times. I append "_f" to all registers that are flip-flops.
  • I recommend having a consistent style for your inputs and outputs. I list them in the following order: outputs, inouts, special inputs such as clk and reset, then inputs.
  • When instantiating a module, always connect by name, using the .port_name(signal_name) notation: a period, the module's port name, then the signal you are connecting. This prevents errors when you change the underlying module or someone reorders the ports.
  • Unlike C, which is fairly strongly typed, Verilog effectively is not: everything is of the same type, so it is easy to reorder connections and not get errors, because everything is just a wire.
  • Example of wrong module instantiation: nand2 my_nand(C, A, B);
  • Example of correct module instantiation: nand2 my_nand(.out(C), .in1(A), .in2(B));
  • Make your circuit synchronous whenever possible. Synchronous design is much easier than asynchronous.
  • Also reduce the number of clock domains and clock boundaries whenever possible.
  • Also remember that crossing clock domains in FPGAs is difficult because LUTs glitch in different ways than ordinary gates do. This causes problems with asynchronous circuits.

Verilog files should be laid out like this (a skeleton example follows the list)..

  • define consts
  • declare outputs (these are _out)
  • declare inouts if any (these are _inout)
  • declare special inputs such as clk and reset
  • declare inputs (these are _in)
  • declare _all_ state (these are _f)
  • declare inputs to state (these have the same name as the state but end in _temp)
  • declare wires (naming not restricted, except connections between two instantiated modules are usually of the form A_to_B_connection)
  • declare 'wire regs' (naming not restricted, except connections are usually of the form A_to_B_connection, and variables that are going to be outputs but still need to be read internally, when you don't want inouts, get _internal appended)
  • do assigns
  • do assigns to outputs
  • instantiations of other modules
  • combinational logic always @'s are next
    do the always @ (posedge clk ...) blocks and put reset values here, and assign _temps to _f's (i.e., state_f <= state_temp). I personally think that there should be no combinational logic inside always @(posedge clk) blocks, with the exception of conditional assignment (write enables), because all Verilog synthesizers understand write enables on flip-flops.
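A minimal skeleton following the layout above (module and signal names are invented for illustration) might look like:

  `define IDLE 1'b0                   // consts

  module my_block (data_out, clk, reset, data_in);
    output data_out;                  // outputs (_out)
    input  clk, reset;                // special inputs
    input  data_in;                   // inputs (_in)

    reg    state_f;                   // all state (_f)
    reg    state_temp;                // inputs to state (_temp)

    assign data_out = state_f;        // assigns to outputs

    // combinational logic: compute the next state from inputs and current state
    always @(data_in or state_f)
      state_temp = data_in ^ state_f;

    // sequential block: reset values and _temp -> _f assignments only
    always @(posedge clk)
      if (reset) state_f <= `IDLE;
      else       state_f <= state_temp;
  endmodule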

VHDL Online


http://esd.cs.ucr.edu/labs/tutorial/VHDL_Page.html

  • Books
  • Tutorials
  • Examples
  • Tools
  • Download
  • Others

VHDL Tutorial: Learn by Example

  • Basic Logic Gates
  • Combinational Logic Design
  • Typical Combinational Logic Components
    Latch and Flip-Flops
  • Sequential Logic Design
  • Typical Sequential Logic Components
http://esd.cs.ucr.edu/labs/tutorial/

Embedded System Design: A Unified Hardware/Software Introduction


Some web resources, references, labs, and slides.
http://esd.cs.ucr.edu/

VLSI Training Institutes


Updated 15 Jan 2011, by Guest Blogger:
Since this article was last published in November 2006, a lot of development has happened in the VLSI training space. To understand what is best suited for you and to select the right institution, please read further at Career Counseling.
--
Responding to a request from a reader, "Venkat", regarding VLSI training institutes..

The first question I would ask anyone who is looking forward to joining a VLSI training institute is: what exactly are you looking for? Later in the article I will give links to some resources that can help you decide.

I feel that many institutes teach just the bare basics (which you can find on blogs and websites around the net, or for that matter in your four years of Engineering/BE/BS) after charging huge sums of money. How many of these institutes target real problems? Problems that will be helpful when you enter the industry, so that you can say you tackled such-and-such a problem and solved it in such-and-such a way? Wouldn't that be great? At least that much you could claim as experience.

Some teach assembly programming, Verilog, VHDL, etc., which is fine, but these can also be learned on your own with a little effort. Why pay for spoon-feeding?

My answer to this query is plain: "Don't go anywhere near them!" They are not worth it. Seriously!

But if you still think that you need to throw away your money... then follow these links below..
http://www.angelfire.com/electronic/in/vlsi/training.html
http://in.geocities.com/srinivasan_v2001/technical/vlsi_training.html
http://www.asic-world.com/verilog/verifaq4.html

Synthesis


Logic synthesis is a process by which an abstract form of desired circuit behavior, typically at the register transfer level (RTL) or behavioral level, is turned into a design implementation in terms of logic gates. Some tools can generate bitstreams for programmable logic devices such as PALs or FPGAs, while others target the creation of ASICs. Logic synthesis is one aspect of electronic design automation.

History of Logic Synthesis
The roots of logic synthesis can be traced to the treatment of logic by George Boole (1815 to 1864), in what is now termed Boolean algebra. In 1938, Claude Shannon showed that the two-valued Boolean algebra can describe the operation of switching circuits. In the early days, logic design involved manipulating the truth table representations as Karnaugh maps. The Karnaugh map-based minimization of logic is guided by a set of rules on how entries in the maps can be combined. A human designer can only work with Karnaugh maps containing four to six variables.

The first step toward automation of logic minimization was the introduction of the Quine-McCluskey algorithm that could be implemented on a computer. This exact minimization technique presented the notion of prime implicants and minimum cost covers that would become the cornerstone of two-level minimization. Another area of early research was in state minimization and encoding of finite state machines (FSMs), a task that was the bane of designers. The applications for logic synthesis lay primarily in digital computer design. Hence, IBM and Bell Labs played a pivotal role in the early automation of logic synthesis. The evolution from discrete logic components to programmable logic arrays (PLAs) hastened the need for efficient two-level minimization, since minimizing terms in a two-level representation reduces the area in a PLA.

However, two-level logic circuits are of limited importance in a very-large-scale integration (VLSI) design; most designs use multiple levels of logic. An early system that was used to design multilevel circuits was LSS from IBM. It used local transformations to simplify logic. Work on LSS and the Yorktown Silicon Compiler spurred rapid research progress in logic synthesis in the 1980s. Several universities contributed by making their research available to the public; most notably, MIS from the University of California, Berkeley, and BOLD from the University of Colorado, Boulder. Within a decade, the technology migrated to commercial logic synthesis products offered by electronic design automation companies.

Behavioral synthesis
With the goal of increasing designer productivity, there has been a significant amount of research on synthesis of circuits specified at the behavioral level using a hardware description language (HDL). The goal of behavioral synthesis is to transform a behavioral HDL specification into a register transfer level (RTL) specification, which can be used as input to a gate-level logic synthesis flow. Behavioral optimization decisions are guided by cost functions that are based on the number of hardware resources and states required. These cost functions provide a coarse estimate of the combinational and sequential circuitry required to implement the design.

The tasks of scheduling, resource allocation, and sharing generate the FSM and the datapath of the RTL description of the design. Scheduling assigns operations to points in time, while allocation assigns each operation or variable to a hardware resource. Given a schedule, the allocation operation optimizes the amount of hardware required to implement the design.

Multi Level Logic Minimization
Typical practical implementations of a logic function utilize a multilevel network of logic elements. Starting from an RTL description of a design, the synthesis tool constructs a corresponding multilevel Boolean network.

Next, this network is optimized using several technology-independent techniques before technology-dependent optimizations are performed. The typical cost function during technology-independent optimization is the total literal count of the factored representation of the logic function, which correlates quite well with circuit area. For example, F = ab + ac + ad has six literals, while the factored form F = a(b + c + d) has only four.

Finally, technology-dependent optimization transforms the technology-independent circuit into a network of gates in a given technology. The simple cost estimates are replaced by more concrete, implementation-driven estimates during and after technology mapping. Mapping is constrained by factors such as the available gates (logic functions) in the technology library, the drive sizes for each gate, and the delay, power, and area characteristics of each gate.

Commercial logic synthesis
Examples of software tools for logic synthesis are Design Compiler from Synopsys and the humorously named BuildGates from Cadence Design Systems; both of these target ASICs. Examples of FPGA synthesis tools include Synplify from Synplicity, Leonardo and Precision from Mentor Graphics, and BlastFPGA from Magma Design Automation.

Comprehensive Verilog Tutorials - Introduction


The history of the Verilog HDL goes back to the 1980s, when Gateway Design Automation developed the Verilog-XL logic simulator, and with it a hardware description language. Cadence Design Systems acquired Gateway in 1989, and with it the rights to the language and the simulator. In 1990, Cadence put the language into the public domain, with the intention that it should become a standard, non-proprietary language.

The Verilog HDL is now maintained by a non-profit organisation, Accellera, which was formed from the merger of Open Verilog International (OVI) and VHDL International. OVI had the task of taking the language through the IEEE standardisation procedure. In December 1995, Verilog HDL became IEEE Std. 1364-1995. A revised version, IEEE Std. 1364-2001, was published in 2001 and is the current version.

Accellera have also been developing a new standard, SystemVerilog, which extends Verilog. SystemVerilog is also expected to become an IEEE standard.

Comprehensive Verilog Tutorials - Welcome


This is an Introductory & Comprehensive Verilog Course, which covers..

  1. Modeling Designs for Digital Simulation.
  2. Modeling Designs for Synthesis.
  3. Design Verification using Verilog HDL.

To gain the most benefit from this course, you should:

  • Have a background in Electronics Engineering.
  • Be familiar with digital components such as AND and XOR gates, multiplexers, and flip-flops.
  • Know basic computer architecture: ALUs, state machines, etc.

Good Luck.

Sponsors


Be a sponsor & Support this Blog

Some of our Proud Sponsors:

VLSIChipDesign

Checkout how much a Text-Link is worth from this Blog

Invitation to be a contributor on this blog!


We are happy to invite you as a contributor to this blog in digital electronics.
Of course, you can choose to be anonymous like us or choose an alias for convenience.

Please be advised that we do not tolerate the posting of proprietary information specific to any company, and that you must respect copyrights. We are not responsible in any way for copyright infringement by contributors. A Metastable State is the sole owner of this blog. Please avoid spam and vulgarity.

A detailed set of rules will be sent to people who are interested. Please get back in touch at "onenanometer (at) gmail (dot) com " with your details.

Added Features!


After a much-awaited move, delayed by developments on Blogger in beta, I am happy to announce that I have successfully converted to the new platform.
This mainly enables me to..
  • categorize posts (labelling is needed)
  • display the latest comments on the sidebar (to be added)
  • add relevant links
  • etc.
I hope you like the new look.
As you have already noticed, I have started a new series on Gate Level Simulation (GLS). This will take time, and I would appreciate your patience.

Gate level simulation - Introduction


With wide acceptance of STA and formal verification tools by the industry, one question still arises in the minds of many: "Why do we need gate level simulation?"
The common reasons quoted by many engineers are:

  1. To check that reset release, the initialization sequence, and boot-up are proper.
  2. Since scan insertion occurs during and after synthesis, it is not checked by RTL simulations.
  3. STA does not analyze asynchronous interfaces.
  4. Unwarranted use of wild cards in static timing constraints can set false paths and multi-cycle paths where they don't belong. This can also be due to design changes, misunderstanding, or typos.
  5. Usage of create_clock instead of create_generated_clock between clock domains.
  6. To obtain the switching factor for power estimation.
  7. X's in RTL simulation can be pessimistic or optimistic (a short sketch of X-optimism follows this list). Any unintended dependencies on initial conditions can be found through GLS.
  8. Design changes or a wrong understanding of the design can lead to incorrect false paths or multicycle paths in the constraints.
  9. To study the activity factor for power estimation.
  10. It provides an excellent feel-good factor that the design has been implemented correctly.
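As a sketch of the X-optimism mentioned in point 7 (module and signal names are made up):

  // RTL X-optimism: if 'sel' is X, the 'if' treats it as false and drives 'b',
  // hiding the unknown; the synthesized mux in a gate-level simulation would
  // propagate X on 'y' instead.
  module mux_rtl (input wire sel, input wire a, input wire b, output reg y);
    always @(*) begin
      if (sel) y = a;
      else     y = b;   // also taken when sel === 1'bx in RTL simulation
    end
  endmodule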

Some design teams use GLS only in a zero-delay, ideal clock mode to check that the design can come out of reset cleanly or that the test structures have been inserted properly. Other teams do fully back annotated simulation as a way to check that the static timing constraints have been set up correctly.

In all cases, getting a gate level simulation up and running is generally accompanied by a series of challenges so frustrating that they precipitate a shower of adjectives as caustic as those typically directed at your most unreliable internet service provider. There are many sources of trouble in gate level simulation. This series will look at examples of problems that can come from your library vendor, problems that come from the design, and problems that can come from synthesis. It will also look at some of the additional challenges that arise when running gate level simulation with back annotated SDF.

RTL considerations and Functional verification of low power designs


This article is about RTL in a Multi-Voltage environment and its implications for verification.

In earlier posts I discussed the Multi-Voltage design infrastructure. Today let's look at 'Power Gating', the most common design style for reducing leakage power.

Typical characteristics of this design style are:

  1. Some of the blocks in the design will be shut down when not functional.
  2. There will be blocks which are always on.
  3. These blocks could be at the same voltage or at different voltages.
  4. The power structure used to shut down a block could be either completely external or internal. Most commonly, an internal power structure is used to shut down blocks.
  5. Either VDD or ground can be cut off.

Consider a classical scenario, wherein implementation/verification becomes a real challenge.

"We have a chip taped-out, working fine in 65nm. We want to add more functionality to the same chip and want to accommodate the logic within the same die-area as before. To accommodate this silicon real estate requirement, we decided to move to 45nm. Since the application as well as the technology node demands extremely low leakage, we want to shut-down some blocks in the design."

Given this, it is very challenging to accommodate Power Gating, since this chip was not architected to accommodate Power Gating.

Now, given the characteristics of the Power Gating design style, here are some facts I think we need to consider while micro-architecting the design.

RTL/Micro-Architecture requirements:-

  1. Some of the blocks will be shut down. Does your design have control logic that generates signals locally to shut down the block?
    • I think that if the design was architected from the beginning with the power gating design style in mind, it will have a control block which probably makes decisions on which blocks to shut down, when, and for how long. The other question that comes to mind is: is this sufficient? Do I need a separate power control block which takes inputs from the control logic and generates the power-down signals in the desired sequence? I think it is good practice to introduce such logic to control the complete power-down/power-on sequence.
  2. Let's look at the control signals required:
    • Control Signal for the Power Switch (Switch_enable)
    • Control Signal for the Isolation Cell Enable (Isolate_enable)
    • Control Signal for the retention flops (Save_Restore)
      • Now, ideally all these control signals are derivatives of each other! It's just that they need to be generated in the right order for the circuit to behave as desired.
      • The sequence could be:
        • On detecting inactivity, generate the controls in this order:
          1. Generate Save_Restore: this indicates that the retention flops need to transfer their contents from the master latch to the slave (shadow) latch and go into sleep mode.
          2. Generate Isolate_enable: this enables the isolation cells so that they clamp the outputs to a known voltage and state.
          3. Since all the basic elements have now been informed of the shut-down operation, we can generate Switch_enable to turn off the power rails that feed the specific blocks.
          4. There could be other actions, such as reducing the frequency or the voltage, as part of this sequencing.
          5. As part of this sequence definition, we should also define the right assertions, so that any violation of the sequence is flagged up front.
        • In SystemVerilog assertion syntax, this could be written as:
            sequence power_sequence;
              (!Save_Restore && !Isolate_enable && !Switch_enable)
              ##1 Save_Restore
              ##1 (Save_Restore && Isolate_enable)
              ##1 (Save_Restore && Isolate_enable && Switch_enable);
            endsequence
  3. Even if the above control signals exist in the RTL and are driven by the power management logic, they are not connected to anything!
    • Even though, as said in bullet 2, these control signals are generated by the control logic, they are not connected to anything outside this control logic. The reasons are:
      • The Power Switch that is used to cut off power does not exist in the RTL. These switches get added during Power Planning; the floor-planning/power planning engineer will add them based on the specification from the architect of the chip. Until the switches are in place, Switch_enable is floating.
      • In the RTL there is nothing specifically done for the Retention Flop; it is coded like any other register, and the synthesis tool will infer it based on commands. Save_Restore ends up floating.
      • Isolation cells do not exist in the RTL, and hence Isolate_enable is floating.
    • Now the question arises: how do we simulate them? Let us look at them in order...
      1. Power Switch Behaviour
      2. Isolation Behaviour
      3. Level Shifter Behaviour
      4. Retention flop.
Remember, we don't have any representation of the above cells in the RTL. Firstly, do we need to simulate the behavior of all of them? In the good old days, PLIs were written by verification teams to simulate all of them.

For example,
Power Switch Behaviour: we can write a function/PLI with the following specification:
$power(block_to_be_pd, type_of_pd(aon/shut-down), signal_used_for_shut_down(switch_enable), acknowledge_signal(acknowledge))

Now this PLI should look at "type_of_pd", which is either always-on or shut-down, and act accordingly. If the block under consideration is of type shut-down, then whenever it detects activity on the "switch_enable" signal it should corrupt all the signals of the block. Once all the signals are corrupted, it should generate an "acknowledge" signal after a user-specified delta delay.

In my humble opinion, this should also include something like:
#0 $power(block1)
#20 $power(block2)

This enables us to simulate the behaviour of power sequencing. There is a lot more that can be added to this PLI routine (a plain-Verilog sketch of the basic idea follows this list), such as:
  1. Trace through the fanout of all the outputs of this block. Flag an error if corrupted signals are propagated all the way to a receiver.
  2. When switch_enable goes inactive, either reset all the logic in the block to "X" or to some random pattern.
  3. During power-up, stagger the power-up of different blocks randomly!
  4. Emulate the impact of IR drop using the same staggering principle!
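For illustration, here is a rough behavioural sketch of the corruption idea in plain Verilog rather than a PLI; the hierarchical path tb_top.u_blk and its outputs out1/out2 are hypothetical names:

  // Behavioural stand-in for the $power idea (simulation only, not synthesizable)
  module power_switch_model (
    input wire switch_enable,   // active-high shut-down request
    output reg acknowledge      // asserted once the block has been corrupted
  );
    parameter ACK_DELAY = 1;    // user-specified delay before acknowledging
    initial acknowledge = 1'b0;

    always @(switch_enable) begin
      if (switch_enable) begin
        // power down: corrupt the block's outputs while the rail is off
        force tb_top.u_blk.out1 = 1'bx;
        force tb_top.u_blk.out2 = 1'bx;
        #ACK_DELAY acknowledge = 1'b1;
      end else begin
        // power up: release the corruption and drop the acknowledge
        release tb_top.u_blk.out1;
        release tb_top.u_blk.out2;
        acknowledge = 1'b0;
      end
    end
  endmodule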
Isolation Behaviour: This is again pretty straightforward. All we need is a PLI or a simple function in Verilog, which will be something like:

$isolate(input, early_switch_enable, output, output_sense)
If "early_switch_enable" is active, maintain the output at the "output_sense" (1/0) value, irrespective of the state of the input.

Retention Behaviour:
This is again pretty straightforward from a simulation perspective. The PLI or simple function will be something like:
$retain("register_names", early_early_switch_enable, wake_up)
Whenever "early_early_switch_enable" is active, copy the contents of "register_names" into a local shadow register or local memory, and whenever "wake_up" goes active, reload "register_names" with the contents of the shadow register.

Now there are various more complex flavours of all the above, depending on the circuit behaviour of these special cells. The major questions to be answered are: are we looking at two different RTLs, one for synthesis without any PLIs and one for simulation with PLIs? Can these PLIs be synthesized into hardware automatically by the available EDA tools? Is this the right approach? Can this be solved using a different approach?

More questions that need answers..
  1. Is a proper sequence for all the control signals defined? Examples of this could be:
    • Switch_enable @ 5ns
    • Isolate_enable @ Switch_enable "+" 10ns
    • Save_Restore @ Isolate_enable "+" 20ns
  2. Now the block which we are trying to shut down needs to generate an acknowledgment signal, indicating power-up or power-down. This signal is again a floating output not driven by any logic, but it is processed by the power management logic!
  3. Is there a requirement such as: the block needs to be powered up within "n" clock cycles? What if you don't receive an acknowledge within "n" clock cycles?
  4. If all the above are taken care of during micro-architecting, there are still a few more questions that need to be answered for logic synthesis and functional simulation:
    – Is the isolation cell/level shifter part of your RTL? How are you coding it? Are you instantiating it in the RTL?
    – Are retention flops part of the RTL? How are you coding them? Are you instantiating them in the RTL?
    – How will the control signals get interpreted by the implementation tool, given that they are floating?
    – How will the acknowledge signal get generated, since it is required by the power management logic but is not generated by any hardware?
    – How will the functionality of all these get verified, given that some of them are either floating or not generated?
    – How will the shut-down get simulated? Nothing special is done in the RTL to simulate this behaviour.
    – How do we model shut-down to verify the functionality?
    – How will the retention flop behaviour get simulated? In the RTL it is coded like any other register.
    – When a block wakes up from shut-down, what should the status of all its logic be? Is random better, or is "X" better? Wouldn't "X" be very pessimistic?
    – How do we simulate the behaviour of the "n" clock cycle requirement on the acknowledge signal from the powered-down block?
    – If there is some always-on logic residing in a shut-down block, how do we implement it? How do we verify it?

Today's Low Power Techniques


Let's take a look at the various low power techniques in use today.
I would classify them into two categories:

  • Structural Techniques
    • Voltage Islands
    • Multi-threshold devices
    • Multi-oxide devices
    • Minimize capacitance by custom design
    • Power efficient circuits
    • Parallelism in micro-architecture
  • Traditional Techniques
    • Clock gating
    • Power gating
    • Variable frequency
    • Variable voltage supply
    • Variable device threshold
Which of the above techniques are aimed at reducing dynamic power, and which at reducing leakage power?

Dynamic Power Reduction
  • Clock Gating (a minimal sketch follows this list)
  • Power efficient circuits
  • Variable frequency
  • Variable voltage supply
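Of these, clock gating is usually implemented with a latch-based integrated clock-gating (ICG) cell; a minimal behavioural sketch is shown below (names are illustrative, and real designs use the library's ICG cell):

  module clk_gate (
    input  wire clk,
    input  wire enable,   // functional enable
    output wire gclk      // gated clock
  );
    reg en_latch;
    // latch the enable while the clock is low to avoid glitches on gclk
    always @(clk or enable)
      if (!clk)
        en_latch <= enable;
    assign gclk = clk & en_latch;
  endmodule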
Leakage Power Reduction
  • Minimize usage of Low Vt Cells
  • Power Gating
  • Back Biasing
  • Reducing Dynamic Power
  • Reduce Oxide Thickness
  • Use FinFETs

Design Elements of Low Power Design


Special cells are required for implementing a Multi-Voltage design.

  1. Level Shifter
  2. Isolation Cell
  3. Enable Level Shifter
  4. Retention Flops
  5. Always ON cells
  6. Power Gating Switches/MTCMOS switch
Level Shifter: The purpose of this cell is to shift the voltage from low to high as well as from high to low. Generally, buffer-type and latch-type level shifters are available. High-to-low level shifters are very simple, whereas low-to-high level shifters are a little more complex, are generally larger (double height), and have two power pins. There are some placement restrictions for low-to-high level shifters to handle noise levels in the design. Level shifters are typically used to convert signal levels and protect against sneak leakage paths. With great care, level shifters can be avoided in some cases, but this will become less practicable on a wider scale.

Isolation Cell:
These are special cells required at the interface between blocks which are shut down and blocks which are always on. They clamp the output node to a known voltage. These cells need to be placed in an always-on region, and the enable signal of the isolation cell needs to be always-on. In a nutshell, an isolation cell is necessary to isolate floating inputs.
There are two types of isolation cells: (a) retain "0" and (b) retain "1". Behavioural sketches of both follow.
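Roughly, the two flavours can be modelled as follows (module and port names are mine, not library cells):

  // clamp-to-0 isolation: output forced low while iso_en is asserted
  module iso_clamp0 (input wire data_in, input wire iso_en, output wire data_out);
    assign data_out = iso_en ? 1'b0 : data_in;
  endmodule

  // clamp-to-1 isolation: output forced high while iso_en is asserted
  module iso_clamp1 (input wire data_in, input wire iso_en, output wire data_out);
    assign data_out = iso_en ? 1'b1 : data_in;
  endmodule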

Enable Level Shifter: This cell is a combination of a level shifter and an isolation cell.

Retention Flops: These cells are special flops with multiple power supplies. They typically use a shadow register to retain their value even when the block in which they reside is shut down. All the paths leading to this register need to be always-on, and hence special care must be taken to synthesize, place, and route them. In a nutshell: "When design blocks are switched off for sleep mode, data in all flip-flops contained within the block will be lost. If the designer desires to retain state, retention flip-flops must be used."

The retention flop has the same structure as a standard master-slave flop. However, the retention flop has a balloon latch that is connected to true-Vdd. With the proper series of control signals before sleep, the data in the flop can be written into the balloon latch. Similarly, when the block comes out of sleep, the data can be written back into the flip-flop.
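A rough behavioural sketch of this save/restore behaviour, assuming pulse-style control signals and invented names (this is a simulation model, not the real balloon-latch circuit):

  module retention_dff (
    input  wire clk,
    input  wire d,
    input  wire save,      // pulse: copy state into the shadow (balloon) latch before sleep
    input  wire restore,   // pulse: copy the shadow latch back into the flop on wake-up
    output reg  q
  );
    reg shadow;                          // always-on shadow storage
    always @(posedge save) shadow <= q;  // save before power-down
    always @(posedge clk or posedge restore)
      if (restore) q <= shadow;          // restore on wake-up
      else         q <= d;               // normal flop behaviour otherwise
  endmodule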

Always ON cells: Generally these are buffers that remain powered irrespective of where they are placed. They can be either special cells or regular buffers. If special cells are used, they have their own secondary power supply and hence can be placed anywhere in the design. Using regular buffers as always-on cells restricts the placement of these cells to a specific region.

In a nut-shell, "If data needs to be routed through or from sleep blocks to active blocks and If the routing distance is excessively long or the driving load is excessively large, then buffers might be needed to drive the nets. In these cases, the always-on buffers can be used."

Power Gating Switches/MTCMOS Switch: MTCMOS stands for multi-threshold CMOS, where low-Vt gates are used for speed, and high-Vt gates are used for low leakage. By using high-Vt transistors as header switches, blocks of cells can be switched off to sleep-mode, such that leakage power is greatly reduced. MTCMOS switches can be implemented in various different ways. First, they can be implemented as PMOS (header) or NMOS (footer) switches. Secondly, their granularity can be implemented on a cell-level (fine-grain) or on a block-level (coarse-grain). That is, the switches can be either built into every standard cell, or they can be used to switch off a large design block of standard cells.

Infrastructure Needs for Multi-Voltage Designs


Before we start implementing a Multi-Voltage design, certain questions need to be answered from a process/library perspective, such as:

  1. Available Operating Voltages (PVT)
  2. Do we have special cells such as Level Shifters/Isolation cells/Power Gating Switches?
  3. If Level Shifter exists, what kinds of level shifters are available? (Ex: Enable Level Shifter etc)
  4. What are the different corners that need to be used for sign-off?
  5. How should we handle OCV?
  6. How accurate are these timing models? Is NLDM good enough or do we need CCS/ECSM models?
  7. Do we have special cells with dual rails (e.g., retention flops)? If yes, how is the timing captured for each rail?
  8. Are these cells characterized for power? Do they have state-dependent, path-dependent information?
  9. If special cells exist, are they modeled according to the EDA tools' requirements?
  10. For feed-through implementation, do we have special always-on buffers? What is the impact of routing the secondary power pins of these buffers on routing resources?
  11. Given a range of operating voltages, is there an easy way, at an early stage of the implementation cycle, to judge the right voltage selection (power/performance product)?
  12. Am I getting the required power savings by implementing the design in a Multi-Voltage style? For example, if the number of special cells required is too large, is it worth the effort? Can we look at an alternate way of saving power?

Multi Voltage magic


In the last few weeks I have been quite busy with a lot of research on low power design. There are many tutorials on low power, and everyone's concerns and questions seem to revolve around Multi-Voltage design. What I could sense was that many designers are implementing Multi-Supply (Power Gating) as opposed to real Multi-Voltage design. There are three classic styles of this new design approach that I found:

  1. Multi-Supply(Power Gating)
  2. Static Multi-Voltage
  3. Dynamic Voltage and Frequency Scaling

Let us discuss Multi-Voltage design in the forthcoming posts. Like any other design implementation, MV design has its own challenges and is much more difficult to sign off.

I would like to classify the different stages of the design into small segments so that we can discuss one by one.

  • Design Infrastructure
  • Architectural level consideration
  • Microarchitecture
  • RTL Design
  • RTL functional verification
  • Implementation
  • Functional Sign-Off
  • Silicon Signoff
  • Resources For Help

The next post is going to be an in-depth technical article, I guess!

Vt Cells and Spacing Requirements


Multi-Vt placement/spacing concerns

I was just thinking about the most common concerns faced today in addressing leakage power. The Multi-Vt spacing requirement is something everyone faces, as multi-Vt cells have more or less become part of the regular implementation flow, so I thought of sharing it today.

What's Multi-Vt?

In trying to meet stringent leakage requirements, the usage of Multi-Vt cells has become more or less a must-have. Let me give a small description of it.

Balancing timing and leakage power requires the use of multiple libraries whose cells operate at different threshold voltages. Cells which operate at a higher threshold voltage are slower and less leaky, whereas cells that operate at a lower threshold voltage are faster and leakier. Optimization engines meet timing goals by using low-Vth cells on critical timing paths and high-Vth cells on non-critical paths. The low- and high-Vth cells have the same footprint for equivalent functions. Depending upon where you perform optimization in the flow, these cells are either simply swapped (ECO) or paths are resynthesized (synthesis/placement) to meet timing and leakage goals.

Is it so simple?

Even though the above sounds simple, it comes with its own implications that need to be addressed during the chip finishing stage to meet certain process requirements.

Maintaining the same footprint requires careful library design, because the low-Vth cells have a different well implant to create their lower threshold voltage. If this implant extended to the edges of the cell, it could overlap the edge of an adjacent high-Vth cell. The cells are therefore designed with a small buffer space around the edges, so Low Vt and High Vt cells can be placed side by side. Figures 1 and 2 below demonstrate a simple scenario of Vt spacing requirements.

Problems can occur if cells of the same Vth type are placed with a small space between them and a filler cell of the opposite Vth type is used to fill the gap. This mismatched filler creates a gap in the implant regions that violates design rules. Typically this problem is addressed by inserting filler cells intelligently.

How should I handle it?

Most of the current generation P&R tools like ICC/Astro will insert suitable filler cells next to the VT cells, when needed. Typically, filler cells are used to fill any spaces between regular library cells to avoid planarity problems and provide electrical continuity for power and ground. Because the High Vt cells have a different diffusion layer over them, a High Vt filler cell needs to be placed between High Vt cells, and a Low Vt filler cell needs to be placed between Low Vt cells.

Free verilog simulator


Event simulation versus cycle simulation


By popular demand:

Event simulation allows the design to contain simple timing information - the delay needed for a signal to travel from one point to another. During simulation, signal changes are tracked in the form of events. A change at a certain time triggers an event after a certain delay. Events are sorted by time when they will occur, and when all events for a particular time have been handled, the simulated time is advanced to the time of the next scheduled event. How fast an event simulation runs depends on the number of events to be processed (the amount of activity in the model).
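For example, a simple continuous assignment with a delay is enough to drive event scheduling: any change on an input schedules an evaluation, and the result is propagated after the specified delay (the delay units follow the `timescale in effect). A minimal sketch:

  module and_gate (input wire a, input wire b, output wire y);
    // any change on a or b triggers an evaluation event; the new value of y
    // is scheduled 2 time units later
    assign #2 y = a & b;
  endmodule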

In cycle simulation, it is not possible to specify delays. A cycle-accurate model is used, and every gate is evaluated in every cycle. Cycle simulation therefore runs at a constant speed, regardless of activity in the model. Optimized implementations may take advantage of low model activity to speed up simulation by skipping evaluation of gates whose inputs did not change.

While event simulation can provide some feedback regarding signal timing, it is not a replacement for static timing analysis.
In comparison to event simulation, cycle simulation tends to be faster, to scale better, and to be better suited for hardware acceleration / emulation.

Updates


The last two weeks have witnessed a sudden surge in visitors, so I decided to continue my experiments for some more time, with some new information and updates on some old articles.

I will try to make some time to answer your questions and queries.

Thanks

NOTICE


I have found that this blog has not attracted as many enthusiasts as expected, so due to the lack of participation I am forced to limit my posts.

If you want me to update this Blog at a regular pace, please participate actively by commenting and by making proposals.

AUTHOR

basic arithmetic


  • What are the representations for:
    1. zero in 2's complement
    2. the most positive integer that can be represented using 2's complement
    3. the most negative integer that can be represented using 2's complement
  • Give the 8-digit hexadecimal equivalent of:
    1. 37 (decimal)
    2. -32768 (decimal)
    3. 11011110101011011011111011101111 (binary)
  • Do the following using 6-bit 2's complement arithmetic (a fancy way of saying: ordinary addition in base 2, keeping only 6 bits of your answer). Work in binary (base 2) notation. Remember that subtraction can be performed by negating the second operand and then adding it to the first operand.
    1. 13 + 10
    2. 15 - 18
    3. 27 - 6
    4. -6 - 15
    5. 21 + (-21)
    6. 31 + 12
    7. What happened in the last addition, and in what sense is your answer "right"?
  • "Complement and add 1" doesn't seem to be an obvious way to negate a two's complement number. By manipulating the expression A + (-A) = 0, show that "complement and add 1" does produce the correct representation for the negative of a two's complement number. Hint: express 0 as (-1 + 1) and rearrange terms to get -A on one side and XXX + 1 on the other, and then think about how the expression XXX is related to A using only logical operations (AND, OR, NOT).
  • What range of numbers can be represented with an N-bit sign-magnitude number? With an N-bit two's-complement number?
  • Create a Verilog module that converts an N-bit sign-magnitude input into an N-bit two's complement output. (One possible sketch follows.)
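One possible sketch of an answer to the last exercise; the module, parameter, and port names are my own choice:

  module sm2tc #(parameter N = 8) (
    input  wire [N-1:0] sm,   // sign-magnitude input: sm[N-1] is the sign bit
    output wire [N-1:0] tc    // two's-complement output
  );
    wire [N-1:0] mag = {1'b0, sm[N-2:0]};        // magnitude with the sign bit cleared
    assign tc = sm[N-1] ? (~mag + 1'b1) : mag;   // negate the magnitude if the sign is set
  endmodule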

Testing


You get the final chip back from the fab. Now you do the smoke test (power-up). Hopefully, assuming that things are well (your chip does not go up in smoke), you do the first actual test of data using a sample and a simple test case. Unfortunately, the chip does not seem to function as it is intended to.

How do you know if your chip has a setup time or a hold time problem?

Significance of contamination delay in sequential circuit timing


Q: What is the significance of contamination delay in sequential circuit timing?
Fact: 70-80% of designers who deal with timing closure daily are unaware of it.

So, what is contamination delay anyway?
Look at the figure below. tcd is the contamination delay.

Without understanding contamination delay, you should not consider the timing estimation of any sequential circuit complete.

What do you mean by that?
Contamination delay tells you whether you meet the hold time of a flip-flop. To understand this better, please look at the sequential circuit below.

The contamination delay of the data path in a sequential circuit is critical for the hold time at the flip-flop where the path ends, in this case R2.

Mathematically,
    th(R2) <= tcd(R1) + tcd(CL2)
For example, if tcd(R1) = 0.2 ns and tcd(CL2) = 0.3 ns, then a hold requirement at R2 of up to 0.5 ns is met. Contamination delay is also called tmin, and propagation delay is also called tmax, in many data sheets.

Fifo


Q. Given the following FIFO and rules, how deep does the FIFO need to be to prevent underflow or overflow?
RULES:
1) frequency(clk_A) = frequency(clk_B) / 4
2) period(en_B) = period(clk_A) * 100
3) duty_cycle(en_B) = 25%

This question was picked from..
http://grumpytom.com/Interview_Questions/questions.html

Sol: The question is completely wrong...
1) Underflow in a FIFO cannot be prevented by depth calculation.
2) The calculation the author has provided is unnecessary for the question ;-)

FSM based Interview Question


  • Calculate the size of the ROM if the sequential element is 'n' bits wide.
  • What is the number of locations and the number of bits in each location?
  • What is the least upper bound on the number of states in the above FSM?

                                  research research & research again


                                  1. research takes time. spend long hours performing design, simulations, experiments, testing, and writing.
                                  2. read the literature to obtain background for your research topic. the literature will NOT contain the answer to your research question.
                                  3. resist the urge to keep reading the literature forever. You must do something. If you do not know where to begin, recreate some work found in the recent literature. In doing so, you are bound to come across open research questions.
4. you must perform the research yourself. nobody will tell you every step to take. if they do, they will take advantage of you. beware.
5. if you encounter a problem in your research, try to figure it out yourself. try approaching the problem from several different angles. if you cannot find success, schedule a meeting with your trusted mentor at your convenience. be sure to have specific questions prepared for this session.
                                  follow these steps and you will invent something...

                                  Approaches that can ease multi-clock designs


                                  Problem 1: Metastability
                                  Solution: Use appropriate synchronizers. (please read my earlier articles to understand the types)
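For reference, here is a minimal sketch of the classic two-flip-flop synchronizer for a single-bit level signal; the module and signal names are mine, not from any earlier article.

// Two-flip-flop synchronizer sketch (illustrative names).
module sync_2ff (
    input  wire clk_dst,   // receiving clock domain
    input  wire rst_n,     // active-low reset in the receiving domain
    input  wire d_async,   // signal arriving from another clock domain
    output reg  d_sync     // synchronized version, safe to use in clk_dst
);
    reg meta;              // first stage; may go metastable

    always @(posedge clk_dst or negedge rst_n) begin
        if (!rst_n) begin
            meta   <= 1'b0;
            d_sync <= 1'b0;
        end else begin
            meta   <= d_async;  // may capture a metastable value
            d_sync <= meta;     // second stage gives metastability time to resolve
        end
    end
endmodule

The first stage may go metastable, but it gets a full destination-clock period to resolve before the second stage samples it.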

                                  Problem 2: Reset Synchronization
Solution: Proper care must be taken when deasserting resets, as this can cause sequential elements to go into a metastable state. Have a separate synchronizer for the reset signal as it enters each clock domain.
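A sketch of one common way to do this, assuming an active-low reset (names are illustrative): the reset asserts asynchronously but is released through two flip-flops, so its deassertion is synchronous to the local clock.

// Reset synchronizer sketch: asynchronous assertion, synchronous deassertion.
module reset_sync (
    input  wire clk,        // clock of this domain
    input  wire rst_n_in,   // asynchronous active-low reset
    output wire rst_n_out   // synchronized active-low reset for this domain
);
    reg r1, r2;

    always @(posedge clk or negedge rst_n_in) begin
        if (!rst_n_in) begin
            r1 <= 1'b0;     // assert immediately (asynchronously)
            r2 <= 1'b0;
        end else begin
            r1 <= 1'b1;     // release takes two clock edges,
            r2 <= r1;       // so deassertion is synchronous to clk
        end
    end

    assign rst_n_out = r2;
endmodule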

                                  Problem 3: Glitches across clock boundaries
Solution: Signals entering synchronizers should be driven directly by a flip-flop in the previous clock domain, not by combinational logic.

                                  Problem 4: Insufficient hold time in the receiving clock domain
                                  Solution: A signal passing from a fast clock domain to a slow clock domain must be stable for multiple clock cycles in the driving domain to ensure that the slower clock domain will not miss a transition entirely.

                                  Problem 5: Loss of signal correlation
Solution: Loss of correlation can occur between the bits of a bus, between multiple copies of a single signal, between handshake signals, and where independently synchronized signals reconverge. For multi-bit values, use Gray code so that only one bit changes at a time.
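A minimal sketch of the Gray-code conversion typically used for multi-bit values such as FIFO pointers crossing clock domains (parameter and module names are my own assumptions):

// Binary-to-Gray: adjacent binary values differ in exactly one Gray bit,
// so a receiving domain never samples an inconsistent multi-bit value.
module bin2gray #(parameter WIDTH = 4) (
    input  wire [WIDTH-1:0] bin,
    output wire [WIDTH-1:0] gray
);
    assign gray = bin ^ (bin >> 1);
endmodule

// Gray-to-binary: each binary bit is the XOR of the Gray bits above it.
module gray2bin #(parameter WIDTH = 4) (
    input  wire [WIDTH-1:0] gray,
    output wire [WIDTH-1:0] bin
);
    genvar i;
    generate
        for (i = 0; i < WIDTH; i = i + 1) begin : g
            assign bin[i] = ^(gray >> i);  // reduction XOR of gray[WIDTH-1:i]
        end
    endgenerate
endmodule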

                                  Details:
                                  • Do extensive structural analysis of the design in RTL form to trace all clocks and resets as this can identify
                                    1. all asynchronous clock domains in the design
                                    2. all control and data signals crossing between clock domains
                                    3. any domain crossing signals that have missing or incorrect synchronizers
                                    4. any synchronizer that has the potential for glitches on their inputs
                                    5. any signals that have fanouts to multiple synchronizers
6. any independently synchronized signals that reconverge in the receiving clock domain
                                    7. any clock domains whose reset signals are not properly synchronized
                                    8. any gated or derived clocks with glitch potential
                                  The above can be automated and can be run on designs of any complexity.

                                  words of wisdom:
Human beings are organic. Organic matter is, in general, inefficient by nature, and hence the inorganic things developed by such organic matter are also inefficient.
If we were efficient, would there be something called debugging?

                                  Key points in Logic Design Timing


• In a synchronous design, all memory elements, flip-flops and latches, are synchronized by a master clock.
                                  • A timing diagram is used to graphically describe what happens at each flip-flop or latch output on every clock cycle.
                                  • Clock distribution circuits are designed to minimize the clock skew and jitter. Clock skew is point to point variation in the clock arrival time, while jitter is cycle to cycle variation.
                                  • The critical path is the slowest logic path in the design.
                                  • Flip-flops sample the input D and transfer it to the output Q at each rising or falling edge of the clock.
• The logic input D cannot change during the set-up time before, or the hold time after, the clock edge.
• A set-up violation occurs when the input D arrives during or after the set-up time for the critical path under worst-case timing conditions. The usual fix for a set-up violation is to reduce the logic delay.
• A hold violation occurs when a signal races through two levels of logic, through the fastest path, before the hold time is over under best-case timing conditions. The usual fix is to insert logic delay. (A sketch of how these set-up and hold limits appear in a cell model is given after this list.)
                                  • Latches are level sensitive devices.
                                  • Preventing set-up violations can be easier because latch D inputs can change while the clock is high. Exploiting this fact is called "cycle stealing".
                                  • Preventing hold violations is much harder due to the inclusion of the clock high time. Thus latch-based designs are not recommended.
                                  • Cell libraries are carefully characterized for timing. To calculate the delay through a cell library, you need to know the load capacitance.
                                  • Timing analysis is performed during and after synthesis, and again before fabrication.
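As a sketch of how these set-up and hold limits show up in practice, a Verilog cell model can declare them in a specify block; the cell, its ports and the numbers below are purely illustrative, not taken from any real library.

// Illustrative flip-flop cell with setup/hold timing checks.
module dff_cell (
    input  wire clk,
    input  wire d,
    output reg  q
);
    always @(posedge clk)
        q <= d;

    specify
        $setup(d, posedge clk, 0.25);  // D must be stable 0.25 time units before the clock edge
        $hold(posedge clk, d, 0.10);   // ...and remain stable 0.10 time units after it
    endspecify
endmodule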

                                  FSM Questions


• Design an FSM that has 1 i/p and 1 o/p. The o/p becomes 1 and remains 1 when at least two 0's and two 1's have occurred as i/p's.
• Design a "%3" FSM that accepts one bit at a time, most significant bit first, and indicates if the number received so far is divisible by 3 (a sketch of one possible answer follows this list).
• If the output of a 3-state FSM is connected to the inputs of a 9-state FSM, and the combined machine is then redesigned using a state register with the minimum number of bits, what is the maximum number of bits needed?
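A minimal sketch of one possible answer to the "%3" question (the signal names and the synchronous-reset convention are my own assumptions): keep the running remainder of the bits seen so far; shifting in a bit b turns remainder r into (2r + b) mod 3.

// Divisible-by-3 detector, one bit per clock, MSB first.
module div_by_3 (
    input  wire clk,
    input  wire rst,     // synchronous reset: remainder starts at 0
    input  wire in_bit,  // next bit of the number, MSB first
    output wire div3     // 1 when the bits seen so far form a multiple of 3
);
    reg [1:0] rem;       // running remainder, 0..2

    always @(posedge clk) begin
        if (rst)
            rem <= 2'd0;
        else begin
            // New remainder is (2*rem + in_bit) mod 3.
            case ({rem, in_bit})
                3'b00_0: rem <= 2'd0;  // (0*2+0)%3 = 0
                3'b00_1: rem <= 2'd1;  // (0*2+1)%3 = 1
                3'b01_0: rem <= 2'd2;  // (1*2+0)%3 = 2
                3'b01_1: rem <= 2'd0;  // (1*2+1)%3 = 0
                3'b10_0: rem <= 2'd1;  // (2*2+0)%3 = 1
                3'b10_1: rem <= 2'd2;  // (2*2+1)%3 = 2
                default: rem <= 2'd0;  // rem == 2'b11 is unreachable
            endcase
        end
    end

    assign div3 = (rem == 2'd0);
endmodule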

                                  Verilog Question


                                  A 3:1 mux has three data inputs D0, D1 and D2, two select inputs S0 and S1 and one data output Y.

                                  The value of output Y is calculated as follows:
                                  Y = D0 if S0 = 0 and S1 = 0
                                  Y = D1 if S0 = 1 and S1 = 0
                                  Y = D2 if S1 = 1 (the value S0 doesn't matter)
                                  1. Write a Verilog module for the 3:1 multiplexer using dataflow style modelling (assign) that implements the sum-of-products equation for Y.
                                  2. Write a Verilog module for the 3:1 multiplexer that uses the "?" operator. Again use a dataflow style for your code (ie, use "assign" to set the values of your signals).
                                  3. Write a Verilog module for the 3:1 multiplexer that uses the "case" statement. Remember that a "case" statement can only appear inside of a behavioral block.
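A minimal sketch of one possible answer to part 2, using the conditional operator in dataflow style (the port names follow the question; everything else is my choice):

// 3:1 mux using the conditional operator, dataflow style.
module mux3to1 (
    input  wire D0, D1, D2,
    input  wire S0, S1,
    output wire Y
);
    // If S1 is 1, pick D2 regardless of S0; otherwise S0 chooses between D0 and D1.
    assign Y = S1 ? D2 : (S0 ? D1 : D0);
endmodule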

                                  sequential circuits


[Figure (not reproduced): sequential circuit with four flip-flops, one labeled F1]

• In order for the circuit shown above to operate correctly, what constraints on tH and tS are necessary? Express them in terms of tCD, tPD and the clock period. (The general form of these constraints is sketched after this list.)
• What is the minimum clock period at which this circuit can be clocked and still be guaranteed to work? Express your answer in terms of tH, tS, tCD and tPD. Assume that timing constraints that do not depend on the clock period are met.
• For just this question, suppose there is skew in the CLK signal such that the rising edge of CLK arrives at the flip-flop labeled F1 1ns before it arrives at the other three flip-flops. Assume that hold times are not violated. How does this change the minimum clock period at which the circuit above can be clocked and still be guaranteed to work?
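In general, and independent of the particular figure (the exact answer still depends on the circuit shown): a register-to-register path must satisfy the setup constraint tCLK >= tPD(register) + tPD(logic) + tS and the hold constraint tCD(register) + tCD(logic) >= tH. Clock skew tightens one of these while relaxing the other, depending on whether the launching or the capturing flip-flop sees the clock edge first.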

                                    sequential circuit

[Figure (not reproduced): sequential circuit for this question]

                                    • Assuming that the clock period is 25ns, what is the maximum setup time for the registers for which this circuit will operate correctly?
                                      • Assuming that the clock period is 25ns, what is the maximum hold time for the registers for which this circuit will operate correctly?

                                      sequential circuit

[Figure (not reproduced): sequential circuit containing a ROM]

                                      • What is the smallest value for the ROM's contamination delay that ensures the necessary timing specifications are met?
• Assume that the ROM's tCD = 3ns. What is the smallest clock period that ensures that the necessary timing specifications are met?

                                          sequential circuit

[Figure (not reproduced): two-register FSM with RESET, state bits S0 and S1, and a pair of inverters]

                                          • What is the smallest clock period for which the circuit still operates correctly?
• Suppose the pair of inverters is removed and the Q output of the left register is connected directly to the D input of the right register. If the clock period could be adjusted appropriately, would the optimized circuit operate correctly? If so, explain the adjustment to the clock period that would be needed.
                                          • When the RESET signal is set to "1" for several cycles, what values are loaded into the registers? (Give values for S0 and S1)
• Assuming the RESET signal has been set to "0" and will stay that way, what values will the registers have after the next clock edge, assuming the current values are S0=1 and S1=1?
                                          • Now suppose there is skew in the CLK signal such that the rising edge of CLK always arrives at the left register exactly 1ns before it arrives at the right register. What is the smallest clock period for which the FSM still operates correctly?

                                          sequential circuit


                                          Calculate timing parameters for the system as a whole taking into account d1 and d2. Don't make any assumption about the relative sizes of the two delays.

                                          sequential circuit


                                          Calculate the timing parameters (tS, tH, tCD, tPD, tCLK) for this system as a whole.

                                          TCP Q&A


                                          1. What is TCP?

                                            Transmission Control Protocol (TCP) provides a reliable byte-stream transfer service between two endpoints on an internet. TCP depends on IP to move packets around the network on its behalf. IP is inherently unreliable, so TCP protects against data loss, data corruption, packet reordering and data duplication by adding checksums and sequence numbers to transmitted data and, on the receiving side, sending back packets that acknowledge the receipt of data.

                                            Before sending data across the network, TCP establishes a connection with the destination via an exchange of management packets. The connection is destroyed, again via an exchange of management packets, when the application that was using TCP indicates that no more data will be transferred. In OSI terms, TCP is a Connection-Oriented Acknowledged Transport protocol.

                                            TCP has a multi-stage flow-control mechanism which continuously adjusts the sender's data rate in an attempt to achieve maximum data throughput while avoiding congestion and subsequent packet losses in the network. It also attempts to make the best use of network resources by packing as much data as possible into a single IP packet, although this behaviour can be overridden by applications that demand immediate data transfer and don't care about the inefficiencies of small network packets.

The fundamentals of TCP are defined in RFC 793, and later RFCs refine the protocol. RFC 1122 catalogues these refinements as of October 1989 and summarises the requirements that a TCP implementation must meet.

                                            TCP is still being developed. For instance, RFC 1323 introduces a TCP option that can be useful when traffic is being carried over high-capacity links. It is important that such developments are backwards-compatible. That is, a TCP implementation that supports a new feature must continue to work with older TCP implementations that do not support that feature.

                                          2. How does TCP try to avoid network meltdown?

                                            TCP includes several mechanisms that attempt to sustain good data transfer rates while avoiding placing excessive load on the network. TCP's "Slow Start", "Congestion Avoidance", "Fast Retransmit" and "Fast Recovery" algorithms are summarised in RFC 2001. TCP also mandates an algorithm that avoids "Silly Window Syndrome" (SWS), an undesirable condition that results in very small chunks of data being transferred between sender and receiver. SWS Avoidance is discussed in RFC 813. The "Nagle Algorithm", which prevents the sending side of TCP from flooding the network with a train of small frames, is described in RFC 896.

Van Jacobson has done significant work on this aspect of TCP's behaviour. The FAQ used to contain a couple of historically interesting pieces of Van's email concerning an early implementation of congestion avoidance, but in the interests of saving space they've been removed and can instead be obtained by anonymous FTP from the end-to-end mailing list archive at <ftp://ftp.isi.edu/end2end/end2end-1990.mail>. PostScript slides of a presentation on this implementation of congestion avoidance can be obtained by anonymous FTP from <ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z>.

That directory contains several other interesting TCP-related papers, including one (<ftp://ftp.ee.lbl.gov/papers/fastretrans.ps>) by Sally Floyd that discusses an algorithm that attempts to give TCP the ability to recover quickly from packet loss in a network.

                                          3. How do applications coexist over TCP and UDP?

                                            Each application running over TCP or UDP distinguishes itself from other applications using the service by reserving and using a 16-bit port number. Destination and source port numbers are placed in the UDP and TCP headers by the originator of the packet before it is given to IP, and the destination port number allows the packet to be delivered to the intended recipient at the destination system.

                                            So, a system may have a Telnet server listening for packets on TCP port 23 while an FTP server listens for packets on TCP port 21 and a DNS server listens for packets on port 53. TCP examines the port number in each received frame and uses it to figure out which server gets the data. UDP has its own similar set of port numbers.

                                            Many servers, like the ones in this example, always listen on the same well-known port number. The actual port number is arbitrary, but is fixed by tradition and by an official allocation or "assignment" of the number by the Internet Assigned Numbers Authority (IANA).

                                          4. Where do I find assigned port numbers?

                                            The IANA allocates and keeps track of all kinds of arbitrary numbers used by TCP/IP, including well-known port numbers. The entire collection is published periodically in an RFC called the Assigned Numbers RFC, each of which supersedes the previous one in the series. The current Assigned Numbers RFC is RFC 1700.

                                            The Assigned Numbers document can also be obtained directly by FTP from the IANA at <ftp://ftp.isi.edu/in-notes/iana/assignments>.

5. What are Sockets?

   A socket is an abstraction that represents an endpoint of communication. Most applications that consciously use TCP and UDP do so by creating a socket of the appropriate type and then performing a series of operations on that socket. The operations that can be performed on a socket include control operations (such as associating a port number with the socket, initiating or accepting a connection on the socket, or destroying the socket), data transfer operations (such as writing data through the socket to some other application, or reading data from some other application through the socket) and status operations (such as finding the IP address associated with the socket).

   The complete set of operations that can be performed on a socket constitutes the Sockets API (Application Programming Interface). If you are interested in writing programs that use TCP/IP then you'll probably need to use and understand the sockets API. Your system manuals may have a description of the API (try `man socket' if you're using a Unix system) and many books devote chapters to it. A FAQ list for sockets programming is available on the Web (from its Canadian home and a UK mirror) and by anonymous FTP.

   The TLI (Transport Layer Interface) API provides an alternative programming interface to TCP/IP on some systems, notably those based on AT&T's System V Unix. The Open Group, a Unix standards body, defines a variation of TLI called XTI (X/Open Transport Interface). Note that both sockets and TLI (and XTI) are general-purpose facilities and are defined to be completely independent of TCP/IP. TCP/IP is just one of the protocol families that can be accessed through these APIs.
                                          6. How can I detect that the other end of a TCP connection has crashed? Can I use "keepalives" for this?
                                            Detecting crashed systems over TCP/IP is difficult. TCP doesn't require any transmission over a connection if the application isn't sending anything, and many of the media over which TCP/IP is used (e.g. Ethernet) don't provide a reliable way to determine whether a particular host is up. If a server doesn't hear from a client, it could be because it has nothing to say, some network between the server and client may be down, the server or client's network interface may be disconnected, or the client may have crashed. Network failures are often temporary (a thin Ethernet will appear down while someone is adding a link to the daisy chain, and it often takes a few minutes for new routes to stabilize when a router goes down) and TCP connections shouldn't be dropped as a result.

                                            Keepalives are a feature of the sockets API that requests that an empty packet be sent periodically over an idle connection; this should evoke an acknowledgement from the remote system if it is still up, a reset if it has rebooted, and a timeout if it is down. These are not normally sent until the connection has been idle for a few hours. The purpose isn't to detect a crash immediately, but to keep unnecessary resources from being allocated forever. If more rapid detection of remote failures is required, this should be implemented in the application protocol. There is no standard mechanism for this, but an example is requiring clients to send a "no-op" message every minute or two. An example protocol that uses this is X Display Manager Control Protocol (XDMCP), part of the X Window System, Version 11; the XDM server managing a session periodically sends a Sync command to the display server, which should evoke an application-level response, and resets the session if it doesn't get a response (this is actually an example of a poor implementation, as a timeout can occur if another client "grabs" the server for too long).
                                          7. Can the TCP keepalive timeouts be configured?
                                            This varies by operating system. There is a program that works on many Unices (though not Linux or Solaris), called netconfig, that allows one to do this and documents many of the variables. It is available by anonymous FTP from . In addition, Richard Stevens' TCP/IP Illustrated, Volume 1 includes a good discussion of setting the most useful variables on many platforms.

                                          IP Fragmentation Q&A


                                          1. What is meant by IP fragmentation?
                                          2. The breaking up of a single IP datagram into two or more IP datagrams of smaller size is called IP fragmentation.

                                          3. Why is an IP datagram fragmented?
4. Every transmission medium has a limit on the maximum size of a frame (MTU) it can transmit. As IP datagrams are encapsulated in frames, the size of an IP datagram is also restricted. If the size of an IP datagram is greater than this limit, then it must be fragmented.

                                          5. Which RFCs discuss IP fragmentation?
6. RFC 791 and RFC 815 discuss IP datagrams, fragmentation and reassembly.

                                          7. Is it possible to select an IP datagram size to always avoid fragmentation?
8. It is not possible to select a single IP datagram size that always avoids fragmentation, because the MTU differs from one transmission medium to another. It is possible, though, for a given path to choose a size that will not lead to fragmentation. This is called Path MTU Discovery and is discussed in RFC 1191. The TCP transport protocol tries to avoid fragmentation using the Maximum Segment Size (MSS) option.

9. Where may an IP datagram get fragmented?
                                          10. An IP datagram may get fragmented either at the sending host or at one of the intermediate routers.

                                          11. Where are the IP datagram fragments reassembled?
                                          12. The IP fragments are reassembled only at the destination host.

                                          13. How to prevent an IP datagram from being fragmented?
14. An IP datagram can be prevented from fragmentation by setting the "don't fragment" flag in the IP header.

                                          15. What happens when a datagram must be fragmented to traverse a network, but the "don't fragment" flag in the datagram is set?
16. A datagram whose "don't fragment" flag is set is discarded if it must be fragmented to traverse a network. Also, an ICMP error message is sent back to the sender of the datagram.

                                          17. Will all the fragments of a datagram reach the destination using the same path?
18. The different fragments of the same IP datagram can travel either along the same path or along different paths to the destination.

                                          19. Will all the fragments of a datagram arrive at the destination system in the correct order?
                                          20. The different fragments of a single IP datagram can arrive in any order to the destination system.

                                          21. What happens to the original IP datagram when one or more fragments are lost?
                                          22. When one or more fragments of an IP datagram are lost, then the entire IP datagram is discarded after a timeout period.

                                          23. What is the minimum size of an IP fragment?
24. The minimum size of an IP fragment is the minimum size of an IP header plus eight data bytes. Most firewall-type devices will drop an initial IP fragment (offset 0) that does not contain enough data to hold the transport headers. In other words, an IP fragment with offset 0 normally needs 20 octets of data in addition to the IP header in order to get through a firewall.

                                          25. What are the limitations on the size of a fragment?
                                          26. The size of an IP datagram fragment is limited by

                                            1. The amount of remaining data in the original IP datagram
                                            2. The MTU of the network and
3. The requirement that the fragment's data length be a multiple of 8 bytes, except for the final fragment.

                                          27. How is an IP datagram fragment differentiated from a non-fragmented IP datagram?
                                          28. A complete IP datagram is differentiated from an IP fragment using the offset field and the "more fragments" flags. For a non-fragmented IP datagram, the fragment offset will be zero and the "more fragments" flag will be set to zero.

                                          29. How are the fragments of a single IP datagram identified?
                                          30. The "identification" field in the IP header is used to identify the fragments of a single IP datagram. The value of this field is set by the originating system. It is unique for that source-destination pair and protocol for the duration in which the datagram will be active.

                                          31. How is the last fragment of an IP datagram identified?
                                          32. The last fragment of an IP datagram is identified using the "more fragments" flag. The "more fragment" flag is set to zero for the last fragment.

                                          33. How is the length of a complete IP datagram calculated from the received IP fragments?
34. The length of the complete IP datagram is calculated from the last fragment: its fragment offset field (multiplied by 8) gives where its data begins, and adding the length of its data gives the total data length of the original datagram.

                                          35. How is an IP datagram fragmented?
                                          36. In the following example, an IP datagram is fragmented into two. This same algorithm can be used to fragment the datagram into 'n' fragments.

                                            1. The IP layer creates two new IP datagrams, whose length satisfies the requirements of the network in which the original datagram is going to be sent.
                                            2. The IP header from the original IP datagram is copied to the two new datagrams.
                                            3. The data in the original IP datagram is divided into two on an 8 byte boundary. The number of 8 byte blocks in the first portion is called Number of Fragment Blocks (NFB).
                                            4. The first portion of the data is placed in the first new IP datagram.
                                            5. The length field in the first new IP datagram is set to the length of the first datagram.
                                            6. The fragment offset field in the first IP datagram is set to the value of that field in the original datagram.
                                            7. The "more fragments" field in the first IP datagram is set to one.
                                            8. The second portion of the data is placed in the second new IP datagram.
                                            9. The length field in the second new IP datagram is set to the length of the second datagram.
                                            10. The "more fragments" field in the second IP datagram is set to the same value as the original IP datagram.
                                            11. The fragment offset field in the second IP datagram is set to the value of that field in the original datagram plus NFB.
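A worked example of the algorithm above, with invented numbers: suppose a datagram with a 20-byte header and 4000 data bytes must cross a link whose MTU is 1500 bytes. It is split into three fragments, each with its own 20-byte copy of the header: fragment 1 carries data bytes 0-1479 (offset 0, more-fragments = 1), fragment 2 carries bytes 1480-2959 (offset 1480/8 = 185, more-fragments = 1), and fragment 3 carries the remaining 1040 bytes, 2960-3999 (offset 2960/8 = 370, more-fragments = 0).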

37. How does a destination system reassemble the fragments of an IP datagram?
                                            1. When a host receives an IP fragment, it stores the fragment in a reassembly buffer based on its fragment offset field.
                                            2. Once all the fragments of the original IP datagram are received, the datagram is processed.
                                            3. Upon receiving the first fragment, a reassembly timer is started.
                                            4. If the reassembly timer expires before all the fragments are received, the datagram is discarded.

                                          38. What fields are changed in an IP header due to fragmentation?
                                          39. The following IP header fields are changed due to IP fragmentation:

                                            1. Total Length
                                            2. Header Length
                                            3. More Fragments Flag
                                            4. Fragment Offset
                                            5. Header Checksum
                                            6. Options

                                          40. What happens to the IP options field when an IP datagram is fragmented?
                                          41. Depending on the option, either it is copied to all the fragments or to only the first fragment.

                                          42. Which IP options are copied to all the fragments of an IP datagram?
                                          43. If the most significant bit in the option type is set (i.e. value one), then that option is copied to all the fragments. If it is not set (i.e. value zero), it is copied only to the first fragment.

                                          IP Addressing Q&A


                                          IP Q&A


                                          1. What is IP?
                                          2. Internet Protocol (IP) is an unreliable, best effort delivery, connection-less protocol used for transmitting and receiving data between hosts in a TCP/IP network.

                                          3. To which OSI layer does IP belong?
                                          4. IP belongs to the Network Layer (layer 3) in the OSI model.

                                          5. Which RFC discusses IP?
6. RFC 791 discusses IP protocol version 4.

                                          7. Which version of IP is discussed in this document?
                                          8. IP version 4 (IPv4) is discussed in this document.

                                          9. What do you mean by IP is an unreliable protocol?
10. IP is an unreliable protocol because it does not guarantee the delivery of a datagram to its destination. The reliability must be provided by the upper-layer protocols like TCP. IP does not support flow control, retransmission, acknowledgement or error recovery.

                                          11. What do you mean by IP is a best-effort protocol?
12. IP is a best-effort protocol because it makes every effort to transmit a datagram and does not discard datagrams arbitrarily. However, the delivery of the datagram to the destination is not guaranteed.

                                          13. What do you mean by IP is a connection-less protocol?
14. IP is a connection-less protocol because it does not maintain state information about the connection to a destination host. Each datagram is handled independently of other datagrams, and each datagram may reach the destination through a different network route.

                                          15. What is the role of IP in the TCP/IP protocol suite?
                                          16. IP is used for

                                            1. Transmitting data from higher-level protocols like TCP, UDP in IP datagrams, from one host to another host in the network.
                                            2. Identifying individual hosts in a network using an IP address.
                                            3. Routing datagrams through gateways and
                                            4. Fragmenting and reassembling datagrams based on the MTU of the underlying network.

                                          17. What is an IP Datagram?
18. An IP datagram is the basic unit of information used by the IP layer to exchange data between two hosts. An IP datagram consists of an IP header and data.

19. How is higher-level data carried by IP to a destination host?
                                          20. The data from higher-level protocols like TCP, UDP is encapsulated in an IP datagram and transmitted to the destination host. IP will not modify the higher-level data.

                                          21. What is the minimum and maximum size of an IP datagram?
22. The maximum size of an IP datagram is 65535 bytes. The minimum is just an IP header with no data (20 bytes), although every host must be able to accept datagrams of at least 576 bytes (see below).

                                          23. What is the minimum and maximum size of an IP datagram header?
                                          24. The minimum size of an IP datagram header is 20 bytes. The maximum IP datagram header size is 60 bytes.

                                          25. Is there a limitation on the minimum size of a IP datagram a network can handle?
                                          26. Yes. All IP networks must be able to handle datagrams of at least 576 bytes in length.

                                          27. What are the fields in an IP datagram header?
                                          28. The various fields in an IP datagram header and their size in bits are shown below:

Field                     Size
-----                     ----
Version                   4 bits
IP Header Length          4 bits
Type of Service           8 bits
Datagram Size             16 bits
Datagram ID               16 bits
Control Flags             3 bits
Fragment Offset           13 bits
Time to Live              8 bits
Protocol                  8 bits
Header Checksum           16 bits
Source IP Address         32 bits
Destination IP Address    32 bits
Options                   variable length

The various fields are explained below:

Version                   IP protocol version. For IPv4, this value is 4.
IP Header Length          Length of the IP header in multiples of 32-bit words.
Type of Service (TOS)     Quality of Service (QoS) requested for this datagram.
Datagram Size             Length of the entire datagram in bytes, including the header and the payload.
Datagram ID               Current datagram identifier.
Control Flags             Bit 0: reserved. Bit 1: 0 - allow fragment, 1 - don't fragment. Bit 2: 0 - last fragment, 1 - more fragments.
Fragment Offset           Offset of this fragment within the original IP datagram, in multiples of 8 bytes.
Time to Live (TTL)        The time up to which this datagram can live in the network.
Protocol                  The upper-layer protocol to which this datagram should be delivered, e.g. TCP, UDP.
Header Checksum           IP header checksum.
Source IP Address         IP address of the source host sending this IP datagram.
Destination IP Address    IP address of the destination host to which this IP datagram must be delivered.
Options                   Used for timestamps, security, source routing, etc.

                                          29. What is the byte order used for transmitting datagram headers in the TCP/IP protocol suite?
30. All the datagram headers in the TCP/IP protocol suite are transmitted in "big endian" byte order, i.e. the most significant byte is transmitted first. This is also called "network byte order".

31. Why are there two length fields (IP header length, IP datagram length) in the IP header?
                                          32. The size of the IP header is not fixed. Depending on the IP options present, the size of the IP header will vary. A separate field for the IP header length is added, so that the destination system can separate the IP datagram header from the payload.
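For example, a header length field of 5 means 5 x 32-bit words = 20 bytes, i.e. a header with no options; the receiver then knows the payload starts 20 bytes into the datagram and is (datagram size - 20) bytes long.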

                                          33. How is the value for datagram identifier calculated?
                                          34. The IP datagram identifier is just a sequence number assigned by the transmitting host. The algorithm for assigning value to this field is not specified by the IP protocol.

                                          35. What is the use of datagram identifier field?
                                          36. The IP datagram identifier field is used to uniquely identify and assemble the different fragments of an IP datagram.

                                          37. Is the datagram identifier field unique for each IP datagram?
38. Yes. The IP datagram identifier field is different for each IP datagram transmitted. However, the fragments of a single IP datagram share the same identifier value.

                                          39. What is the use of Type Of Service field in the IP header?
40. The Type Of Service (TOS) field is used by upper-layer protocols like TCP to describe the desired quality of service for an IP datagram. This field can be used to specify the nature and priority of an IP datagram (like Network Control, Immediate, Critical, etc.) and the criteria for selecting a path for forwarding a datagram by a gateway.

41. What are the different types of criteria that can be specified using the TOS field?
                                          42. The different types of criteria that can be specified by the TOS field in an IP datagram are:

                                            1. Minimize delay,
                                            2. Maximize throughput
                                            3. Maximize reliability
                                            4. Minimize cost and
                                            5. Normal service.

                                          43. Which RFC discusses the Type Of Service (TOS) field?
                                          44. RFC 1349 discusses the Type Of Service (TOS) field.

                                          45. What is the use of the Time To Live (TTL) field in the IP header?
46. The TTL field is used to limit the lifetime of an IP datagram and to prevent indefinite looping of IP datagrams.

                                          47. How is the TTL field used to prevent indefinite looping of IP datagrams?
48. The TTL field contains a counter value set by the source host. Each gateway that processes this datagram decreases the TTL value by one. When the TTL value reaches zero, the datagram is discarded.
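For example, a datagram sent with a TTL of 8 that has to cross more than 8 routers never reaches its destination: the router that decrements the TTL to zero discards the datagram and reports the error back to the source with an ICMP message.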

                                          49. What is the typical value for the TTL field?
                                          50. The typical value for a TTL field is 32 or 64.

                                          51. When is a datagram considered undeliverable?
                                          52. If a datagram cannot be delivered to the destination host due to some reason, it is considered an undeliverable datagram.

53. How does a datagram become undeliverable?
                                          54. A datagram may become undeliverable, if

                                            1. The destination host is down.
                                            2. The route for the destination host is not found.
                                            3. A network in the route to the destination host is down.
                                            4. The Time To Live (TTL) value of the datagram becomes zero.

                                          55. What happens to an undeliverable datagram?
                                          56. An undeliverable datagram is discarded and an ICMP error message is sent to the source host.

                                          57. Is it possible for an IP datagram to be duplicated?
58. Yes. A host may receive the same copy of an IP datagram twice. It is up to the higher-layer protocols to discard the duplicate copy of the datagram.

                                          59. Which part of the IP datagram is used for calculating the checksum?
                                          60. The checksum field in the IP header covers only the IP header. The payload data is not used for calculating this checksum.