Converting a floating-point representation of an algorithm into a fixed-point representation is an integral step on the path to implementation. Unfortunately, it is known to be time-consuming, tedious and error-prone. This paper from Synopsys describes proven methodologies for overcoming the hurdles, including many useful tips to get you to results faster.


## Formal Verification: Theorem proving

How many times in the course of a project have you heard the term Formal Verification? This relatively short article on theorem proving assumes that you know the basic definition of Formal Verification while we try to see what theorem proving is in relation to it.

Theorem-proving techniques relate specification and implementation as a theorem in logic to be proven in the context of a proof calculus. The implementation acts as a provider of axioms and assumptions for the proof. Like the math classes we had in school, it involves proving lemmas and theorems by deduction, rewriting and induction. This is mainly used for verification at a higher level of abstraction, usually algorithmic verification, and it is most successful for complex problems and systems such as distributed algorithms and real-time systems.

The first tools for this method became available in the 1970s. Currently there exists a variety of tools which are more or less automatic, some of them being ACL2, NQTHM, PVS, HOL, Isabelle, STeP...

However, this technique requires strong user expertise and time-intensive guidance of the proof. The usual theorem-prover tools are based on first-order predicate logic, higher-order predicate logic, recursive functions, induction, etc. E.g., "The ACL2 logic is a first order logic of total recursive functions providing mathematical induction..."

In summary, theorem-proving tools use a sophisticated combination of decision procedures, conditional rewriting, induction and recursion, propositional simplification, and complex heuristics to orchestrate all of that. The tool learns over its lifetime by automatically reusing lemmas proven by the user.


**Brush up of basics:**

*First-order predicate logic:* Predicate logic in which predicates take only individuals as arguments and quantifiers bind only individual variables.

*Induction:* A powerful technique for proving that a theorem holds for all cases in a large or infinite set. The proof has two steps, the basis and the induction step. Roughly, in the basis the theorem is proved to hold for the "ancestor" case, and in the induction step it is proved to hold for all "descendant" cases.

*Higher-order predicate logic:* Predicate logic in which predicates take other predicates as arguments and quantifiers bind predicate variables. For example, second-order predicates take first-order predicates as arguments. Order-n predicates take order-(n-1) predicates as arguments (n > 1).
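For a taste of machine-checked induction of the kind these provers automate, here is a minimal sketch in Lean 4 (chosen here purely for illustration; it is not one of the tools listed above), proving that 0 + n = n by the basis/induction-step recipe just described:

```lean
-- Basis: 0 + 0 = 0 holds by computation.
-- Induction step: assuming 0 + k = k, show 0 + (k+1) = k+1.
theorem zero_add' (n : Nat) : 0 + n = n := by
  induction n with
  | zero => rfl
  | succ k ih => rw [Nat.add_succ, ih]
```

The prover discharges the basis automatically and only needs the rewriting hints in the induction step, which mirrors the deduction-plus-rewriting style described above.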

**Bottomline:** __Very powerful, mainly at the algorithmic level. However, you should go get an expert :-)__

## Memories - Test Patterns & Algorithms - Part 3

In this section some classical, or legacy, memory test algorithms will be examined. Memory test algorithms fall into two categories: functional and dynamic. A functional test targets defects within a memory cell, as well as failures that occur when cell contents are altered by a read or write to another cell. A dynamic test attempts to find access time failures. The All 1s or All 0s tests are examples of functional tests. These tests write 1s or 0s into all memory cells in order to detect individual cell defects including shorts and opens. However, these tests are not effective at finding other failure types.

A memory test pattern that tests for address nonuniqueness and other functional faults in memories, as well as some dynamic faults, is the GALPAT (GALloping PATtern), sometimes referred to as a ping-pong pattern. This pattern accesses each address repeatedly using, at some point, every other cell as a previous address. It starts by writing a background of zeroes into all memory cells. Then the first cell becomes the test cell. It is complemented and read alternately with every other cell in memory. Each succeeding cell then becomes the test cell in turn and the entire read process is repeated. All data are complemented and the entire test is repeated. If each read and compare is counted as one operation, then GALPAT has an execution time proportional to 4N^2, where N is the number of cells. It is effective for finding cell opens, shorts, address uniqueness faults, sense amplifier interaction, and access time problems.
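To make the ping-pong sequence concrete, here is an illustrative Python sketch of GALPAT over an abstract memory. This is only a model of the access pattern, not production memory-test code; the `read` and `write` hooks are hypothetical stand-ins for device access.

```python
def galpat(mem_size, read, write):
    """GALPAT sketch: returns a sorted list of failing addresses."""
    failures = set()
    for background in (0, 1):
        # Write the background value into every cell.
        for addr in range(mem_size):
            write(addr, background)
        test_val = 1 - background
        for test_cell in range(mem_size):
            write(test_cell, test_val)          # complement the test cell
            for other in range(mem_size):
                if other == test_cell:
                    continue
                # Read the test cell alternately with every other cell.
                if read(test_cell) != test_val:
                    failures.add(test_cell)
                if read(other) != background:
                    failures.add(other)
            write(test_cell, background)        # restore before moving on
    return sorted(failures)
```

For each background, the inner loop performs two reads for each of the N-1 partner cells of every test cell, roughly 4N(N-1) reads in total, which is the 4N^2 growth noted above.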

Walking Pattern is similar to the GALPAT except that the test cell is read once and then all other cells are read. To create a Walking Pattern from the GALPAT program, omit the second read operation in the testbench. The Walking Pattern has an execution time proportional to 2N^2 . It checks memory for cell opens and shorts and address uniqueness.

March, like most of the algorithms, begins by writing a background of zeroes. Then it reads the data at the first location and writes a 1 to that address. It continues this read/write procedure sequentially with each address in memory. When the end of memory is reached, each cell is read and changed back to zero in reverse order. The test is then repeated using complemented data. Execution time is of order N. It can find cell opens, shorts, address uniqueness, and some cell interactions.
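The March sequence above can be sketched as follows (again an illustration, with hypothetical `read`/`write` device-access hooks):

```python
def march_test(mem_size, read, write):
    """Simple March sketch: background, ascending r/w, descending r/w."""
    failures = set()
    for d in (0, 1):                    # second pass uses complemented data
        for addr in range(mem_size):    # write a background of d
            write(addr, d)
        for addr in range(mem_size):    # ascending: read d, write not-d
            if read(addr) != d:
                failures.add(addr)
            write(addr, 1 - d)
        for addr in reversed(range(mem_size)):  # descending: read not-d, write d
            if read(addr) != 1 - d:
                failures.add(addr)
            write(addr, d)
    return sorted(failures)
```

Each cell is visited a constant number of times, which is why the execution time is of order N rather than N^2.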

Galloping Diagonal is similar to GALPAT in that a 1 is moved through memory. However, it is moved diagonally, checking both row and column decoders simultaneously. It is of order 4N^(3/2). Row and column GALPATs of order 4N^(3/2) also exist. Sliding Diagonal (see Figure) writes a complete diagonal of 1s against a background of 0s and then, after reading all memory cells, it shifts the diagonal horizontally. This continues until the diagonal of 1s has passed through all memory locations. The Diagonal test, of order N, will verify address uniqueness at a significant speed enhancement over the Walk or GALPAT.

Surround Read Disturb starts by creating a background of all 0s. Then, each cell in turn becomes the test cell. The test cell is complemented and the eight physically adjacent cells are repeatedly read. After a number of iterations the test cell is read to determine if it has been affected by the read of its neighbors. The operation is then repeated for a background of 1s. The intent is to find disturbances caused by adjacent cell operations. Execution time depends on the number of read cycles but is of the order N. Surround Write Disturb is identical to the Surround Read Disturb except that a write rather than a read is performed.

Write Recovery writes a background of 0s. Then the first cell is established as the test cell. A 1 is written into the second cell and the first (test) cell is read. The second cell is restored to 0 and the test cell is read again. This is repeated for the test cell and every other cell. Every cell then becomes the test cell in turn. The entire process is repeated using complemented data. This is an N^2 test that is directed at write recovery type faults. It also detects faults that are detected by GALPAT.

Address Test writes a unique value into each memory location. Typically, this could be the address of that memory cell; that is, the value n is written into memory location n. After writing all memory locations, the data are read back. The purpose of this test is to check for address uniqueness. This algorithm requires that the number of bits in each memory word equal or exceed the number of address bits.
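The Address Test is simple enough to sketch directly; as before, `read` and `write` are hypothetical device-access hooks, and the word-width precondition from the paragraph above is checked explicitly:

```python
def address_test(mem_size, word_bits, read, write):
    """Address-uniqueness sketch: store each address in its own location."""
    # The word must be wide enough to hold any address value.
    assert (1 << word_bits) >= mem_size, "word too narrow to hold the address"
    for addr in range(mem_size):
        write(addr, addr)               # value n into location n
    return sorted(a for a in range(mem_size) if read(a) != a)
```

If two addresses alias to the same physical location, the later write clobbers the earlier one and the read-back pass flags the mismatch.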

The Moving Inversions test inverts a memory filled with 0s to 1s, and conversely. After initially filling the memory with 0s, a word is read. Then a single bit is changed to a 1, and the word is read again. This is repeated until all bits in the word are set to 1, and then repeated for every word in memory. The operation is then reversed, setting bits to 0 and working from high memory to low memory. For a memory with n address bits the process is repeated n times. However, on each repetition, a different bit of the address is taken as the least significant bit for incrementing through all possible addresses. An overflow generates an end-around carry, so all addresses are generated, but the method increments through addresses by 1s, 2s, 4s, and so on. For example, on the second time through, bit 1 (regarding bit 0 as the least significant bit, LSB) is treated as the LSB, so all even addresses are generated out to the end of memory. After incrementing to address 111...110, the next address generated is address 000...001, and then all consecutive odd addresses are generated out to the end of memory. The pattern of memory address generation (read the addresses vertically) for the second iteration is as follows:

    0000 . . . 1111
    . . .
    . . .
    . . .
    0000 . . . 1111
    0011 . . . 0011
    0101 . . . 0101
    0000 . . . 1111

The Moving Inversions test pattern has 12BN log2(N) patterns, where B is the number of bits in a memory word. It detects addressing failures and cell opens and shorts. It is also effective for checking access times.
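The rotated-LSB address ordering described above can be sketched as a small generator (`mi_addresses` is a hypothetical helper; iteration `k` chooses which address bit toggles fastest):

```python
def mi_addresses(n_bits, k):
    """Address order for iteration k of Moving Inversions.

    A normal binary count, but with the counter bits rotated left by k
    so that bit k of the address acts as the LSB; the rotation is a
    bijection, so every address is still generated exactly once.
    """
    n = 1 << n_bits
    for count in range(n):
        addr = ((count << k) | (count >> (n_bits - k))) & (n - 1)
        yield addr
```

For k = 1 this yields all even addresses out to the end of memory, then (via the end-around carry) all consecutive odd addresses, exactly as in the example above.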

To be continued...

We value your feedback!



## Algorithms and High-Level Models

For designs with significant control flow, algorithms can be described in software languages, flowcharts, abstract state machines, algorithmic state machines, etc.

For designs with trivial control flow (e.g. every parcel of input data undergoes the same computation), data-dependency graphs (section 2.4.2) are a good way to describe the algorithm.

For designs with a small amount of control flow (e.g. a microprocessor, where a single decision is made based upon the opcode) a set of data-dependency graphs is often a good choice.

When creating an algorithmic description of your hardware design, think about how you can represent parallelism in the algorithmic notation that you are using, and how you can exploit parallelism to improve the performance of your design.

Flow Charts and State Machines:

Flow charts and various flavours of state machines are covered well in many courses. Generally, everything that you've learned about these forms of description is also applicable in hardware design. In addition, you can exploit parallelism in state machine design to create communicating finite state machines. A single complex state machine can be factored into multiple simple state machines that operate in parallel and communicate with each other.

In software, the expression: (((((a + b) + c) + d) + e) + f) takes the same amount of time to execute as: (a + b) + (c + d) + (e + f).

But, remember: hardware runs in parallel. In algorithmic descriptions, parentheses can guide parallel versus serial execution. In hardware, (((((a + b) + c) + d) + e) + f) takes 5 adders with a depth of 5 to compute the result, while (a + b) + (c + d) + (e + f) also takes 5 adders but a depth of just 3.
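One way to see the difference is to model the two groupings as expression trees and compare adder count against tree depth (a small illustrative sketch; `depth` and `adders` are hypothetical helpers, with nested tuples standing in for adder trees):

```python
def depth(expr):
    """Depth of a nested-tuple expression tree; bare operands are leaves."""
    if not isinstance(expr, tuple):
        return 0
    return 1 + max(depth(expr[0]), depth(expr[1]))

def adders(expr):
    """Number of adders (internal nodes) in the tree."""
    if not isinstance(expr, tuple):
        return 0
    return 1 + adders(expr[0]) + adders(expr[1])

serial   = ((((('a', 'b'), 'c'), 'd'), 'e'), 'f')   # (((((a+b)+c)+d)+e)+f
balanced = ((('a', 'b'), ('c', 'd')), ('e', 'f'))   # ((a+b)+(c+d))+(e+f)
```

Both trees use the same number of adders (area), but the balanced grouping has a shorter critical path (latency), which is exactly the trade-off the parentheses express.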

Data-dependency graphs capture algorithms of datapath-centric designs. Datapath-centric designs have few, if any, control decisions: every parcel of input data undergoes the same computation.

High-Level Models:

There are many different types of high-level models, depending upon the purpose of the model and the characteristics of the design that the model describes. Some models may capture power consumption, others performance, others data functionality.

High-level models are used to estimate the most important design metrics very early in the design cycle. If power consumption is more important than performance, then you might write high-level models that can predict the power consumption of different design choices, but which have no information about the number of clock cycles that a computation takes, or which predict the latency inaccurately.

Conversely, if performance is important, you might write clock-cycle accurate high-level models that do not contain any information about power consumption. Conventionally, performance has been the primary design metric. Hence, high-level models that predict performance are more prevalent and more well understood than other types of high-level models. There are many research and entrepreneurial opportunities for people who can develop tools and/or languages for high-level models for estimating power, area, maximum clock speed, etc.
