@betajet: A programming language produces a new executable(,exe). The existing tools do not compile a new executable either, so to hopefully clarify:
A program language compiler itself does not change, but generates object code that must run on a computer.
"compile" as applied to an HDL generates a netlist that is then processed by other programs in the tool chain. They do not generate executables -- HLS??
To me HDL represents more the physical circuits and not the logical function. Synthesis is limited by the ability of humans to think of unique logic functions that would generate the particular HDL.
An example of formatted data is a spreadsheet that can manipulate input without a compile step. A program could be written to do the same thing, but it would have to be compiled then run.
Anyway, the existing tool chains based on HDLs start in the middle of the design process and essentially ignore the functional/logical aspects. That makes them poor for debug. They assume a compled design and are nearly useless for debug and design iterations. And generate all kinds of messages about physical rather than functional things.
What I suggested is a program to take input that defines the data-flow and the logic to control the gating of registers and memories. (just like the old days when there was logic design and data-flow design then humans would transcribe the logic into machine readable format for automated wiring). NO, NO, NO I really do not mean do it manually let a program tie things together and do the transcription. Just do what the existing tools fail to do.
RTL Register Transfer LEVEL (of abstraction, not language). Hardware Description Language i.e. Verilog VHDL.
HDLs should be the result of logic design, NOT used for design input. HOWEVER Verilog was defined for circuit level simulation. THEN came synthesis to infer logic that would create the same circuits and is limited by the available set of logic functions.
There are no such limits for logic design functions.
Worse yet there are no logic design tools, only simulators that require HDL compilation first which takes unacceptable amount of time. (ModelSim can model HDL, but generally a netlist generated by compile is used.
Programming and Logic Design are totally different. Programming manipulates data sequencially. Hardware gates registers to data flow for data manipulation. There must be sufficient time for the data to stabilize before capturing the result. This time is of no concern to the programmer.
Every hardware register must not change while the contents are being used. This would apply to dataflow programming also.
Hardware consists of data-flow and control logic. Programming is control flow and variables.
We don't need another programming language, rather a tool that accepts text to define registers, memories, and logic functions. The key is to complete the logic design before creating HDL(RTL).
@Matthieu: Agreed "tools too slow" as you said. Seems they also think configuration time of FPGA is too long compared to loading memory.
I think Cx approach is more like what logic design used to be. (design first then synthesize) It just seems better to do the design rather than use "inference" to synthesize and live with what you get.
There is still the point of just loading memory being so much faster than synthesis. Micro-code is still beibg used in CPUs and it also can be used for hardware control as a performance compromise.
I will have to think more about whether it can complement Cx.
@KarlS01: what I meant was that both approaches are comparable in being at a higher level than RTL, but lower level than C++. Now you're right that the technologies are not alike, Cx is for hardware design, whereas C++ is still used in that case to write software (even if that software runs on soft cores)
What I had understood by "tools are too slow" was that they were talking about synthesis, place and route tools, etc. Which could be a good reason for using softcores, since you only have to synthesize once, and then you write software that runs on the softcore. My guess is that they used FPGAs to create their own parallel processing platform suited to their needs. I think they could have used a many-core processor (like Adapteva's Epiphany) for that matter, but it did not yet exist when they started the project. And perhaps it did not meet their needs either, after all they looked at GPU first before deciding on FPGA.
@ Matthieu: "they had to come up with "a middle ground between C++ and RTL where people can program". Interesting, it sounds a lot like Cx"
If Cx produces HDL that must be processed by the tool chain and results in configuring the FPGA and they compile C++ to generate memory content, how can they be alike?
They also stated that it takes too long to reconfigure and the tools are too slow, so it must be they found a fast way to execute C++ source. And they they run many threads in parallel so the threads are probably independent since there is no mention of an OS to synchronize the execution.
@Swan: "For whatever reason Microsoft chose to implement many-core SOFT processors on the FPGA's, so the FPGA's are still running SW, albeit multi-threaded."(parallel multi-threaded)
They stated that it is faster to just re-load memories than to reconfigure the FPGA.
The soft cores run at lower clock frequency, but still accelerate the application. There are many soft cpu's available, but they designed a soft core -- why? traditional CPUs are memory bound in that there is so much load and store overhead to get the operands into registers and to put the result back into memory. Just streaming the FFE into local memory and storing the result of the algorithm is important. Doing many threads in parallel is even better. Compiling C++ source for a custom soft core without a classical ISA is another plus.
FPGA design is totally different than SW programming. Data is processed as it moves thru the FPGA dataflow. SW selects operands and operators repetitively to do processing which requires a processing step for each operator while multiple operators can be evaluated per step in HW.
Also, OpenCL is mostly matrix manipulation so it may allow matrix algorithms to be programmed in C++ it is most likely a subset and probably not the subset needed for FFE evaluation.
I am fully aware of what RTL stands for and what it is used for, namely designing hardware circuits at a low level of abstraction. The thing is that it doesn't have to be this way: you can actually write code structured as tasks running in parallel, each one performing computations described as sequential C-like code. In fact, that's what we're doing at Synflow, we've created a new programming language called Cx that actually makes hardware design fun again! And yes, I believe that RTL kind of sucks, which is a good reason why Microsoft ended up designing softcores to program them in C++ on FPGA lol
The algorithms may change, but that doesn't mean that the processor architecture must change.
FPGAs are expensive solutions to mass production systems. An adequate ASP, this is engineering after all, could be substantially less expensive than the FPGAs and also substantially reduce the data center power consumption. They would also be capable of higher clock speeds than an FPGA.