FPGA for LDW
Lane Departure Warning System Capitalising on Advantages of FPGAs
Drowsy and distracted drivers often veer outside of their lane when driving on the motorway. A driver assistance research study by INVENT-FAS reports that 28 percent of accidents were caused by lane merging or crossing manoeuvres - 15 percent were road departures.
How can technology be used to give drivers warning that they're going off course and help prevent many of these accidents? The best way is to use the same indicators that drivers use - the lane markings. All motorways in Europe and most in the US have painted lane markings, so why not point a camera out of the front of the car to find them? Real-time high integrity image processing used to be the preserve of the military - requiring huge, expensive racks of equipment to perform the complex algorithms required. Some of our early developments of this system required a car full of electronics!
However, times are changing: a technology referred to as a Lane Departure Warning system is now in production on mid-range vehicles. This is a small, low-cost and low-power consumption technology targeted at mass consumers.
Conekt has been working with Driver Assistance Systems since the 1960s (originally as Lucas Industries). At that time, radar was first used to control the acceleration, or longitudinal behaviour, of a vehicle. In the 1990s, work began on using image processing of forward-facing cameras to control the lateral behaviour of the car, with the Prometheus programme providing a major demonstration vehicle: a car which could be driven ‘no-hands, no-feet’ on the motorway.
Since the Prometheus demonstrator, further development has enabled TRW to produce a system that can now be fitted to production vehicles to provide the driver with a ‘lane-support’ function, in which the electric power steering system gives feedback to the driver when they are too close to or crossing the lane markings.
The lane departure warning system
The system is split into several parts:
Image data is taken from the camera into a series of low-level processing blocks (the subject of this article). The output of this is a collection of ‘interesting-looking features’ about the scene ahead, which the high-level processing (implemented in a traditional microcontroller) uses to describe the position of the car on the road to the steering algorithm. This then sends a command over the controller-area network (CAN) bus to the steering system. The driver receives warnings from the system in the form of intuitive movements of the steering wheel known as haptic feedback - guiding the driver back to the centre of the lane.
Low level processing
Merging at motorway junctions
In order to provide the robust sensing that an active system requires, Conekt has developed algorithms that operate over the entire road surface in front of the vehicle - making use of all the data available and not just a small window of interest. In addition, wide-screen format video sensing chips enable a view of the lanes to either side of the vehicle. Currently, many available systems are only able to provide a description of the road for a few metres in front of the vehicle, assuming it to be straight over that distance. Conekt's algorithms are able to estimate the curvature of the road up to 50 metres ahead.
The production system uses a high dynamic range wide-VGA sensor, with a resolution of around 0.3 million pixels. This does not sound like much compared with today's digital cameras, but the sensor provides data at up to 60 frames per second, producing a raw data rate in the region of 100-200 Mbits/sec. Processing all this data is a challenge. The high dynamic range enables the system to operate even when the sun is low and shining directly at the sensor, and in night-time conditions. Each pixel is represented by 10 bits, rather than the conventional 8 bits, which can add processing overhead in conventional approaches.
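As a rough check on that figure, the raw data rate follows directly from resolution, bit depth and frame rate. The sketch below assumes a 752 x 480 pixel array for the wide-VGA sensor (an assumption; the article gives only "around 0.3 million pixels") and counts active pixels only, ignoring blanking intervals.

```python
def raw_data_rate_mbits(width: int, height: int, bits_per_pixel: int, fps: float) -> float:
    """Return the raw pixel data rate in Mbit/s (active pixels only)."""
    return width * height * bits_per_pixel * fps / 1e6


# Assumed wide-VGA geometry; the exact sensor resolution is not stated in the article.
WIDTH, HEIGHT = 752, 480
BITS_PER_PIXEL = 10  # from the article

rate_30 = raw_data_rate_mbits(WIDTH, HEIGHT, BITS_PER_PIXEL, 30)
rate_60 = raw_data_rate_mbits(WIDTH, HEIGHT, BITS_PER_PIXEL, 60)
print(f"{rate_30:.1f} Mbit/s at 30 fps, {rate_60:.1f} Mbit/s at 60 fps")
```

Under these assumptions the rate comes out at roughly 108 Mbit/s at 30 fps and 217 Mbit/s at 60 fps, which is in line with the 100-200 Mbits/sec region quoted above.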
Performs well - even in the rain
The traditional approach to this kind of problem is to use one (or more!) DSP devices, or to design a custom ASIC.
During development, it was estimated that between 400 and 800 million operations per second (MOPS) would be required, depending on the DSP architecture. There are DSPs available that are suitable for use in an automotive environment; however, extracting that level of performance from the device is a challenging task, requiring low-level assembly language coding or non-portable C-language extensions. In addition, it was likely that the current algorithms would heavily load the serial DSP processor, leaving no room for further development during the product engineering process. Nor would there be any scalability to add new features based on the camera sensor (e.g. intelligent headlamp control).
Another solution would be to develop an ASIC to perform this task. ASICs have the potential for very low costs, very high performance and very low power consumption. However, developing a custom ASIC is a very expensive proposition in terms of non-recoverable expenditure, and this product is at an early point in its lifecycle. Even during the product-development phase of the project, extra features will be added to the image processing. These can be accommodated in a flexible architecture, but once an ASIC has been committed to production it is very expensive to add more features. Until the volumes reach the millions of units mark, the costs of programmable approaches will be perfectly reasonable for this project. For this reason, this article will reject the ASIC approach at this point!
Enter the FPGA
Operation at night is not a problem
After some detailed studies regarding the cost and performance trade-offs of different image processing architectures, we concluded that using an FPGA for part of the task provides us with the right balance for this application. We chose Xilinx because of its commitment to the automotive market through its XA grade parts program. The Spartan-3E family was a natural fit based on the cost targets and performance needs of the application. Overall, the flexibility of custom programmable logic coupled with the cost-effective and high-performance nature of the FPGA has allowed us to produce an efficient solution for this application.
Xilinx's Spartan-3E FPGAs provide extensive logic in the form of 4-input look-up tables (LUTs) and flip-flops (FFs). Extra functionality is provided in the form of embedded multipliers and dual-port RAMs. The balance of logic, memory and other functionality provides an ideal flexible platform for this solution.
Careful design pays off
Due to the design of Conekt's algorithms, they map very well to the FPGA architecture. Conekt engineers designed a pipelined processing approach that eliminated the need for an image frame buffer, thereby reducing material costs and board area. Our extensive experience in this field enabled the prototype designs from earlier iterations to be ported to the production-intent FPGA very quickly. The fact that the code used was designed for an FPGA rather than an ASIC meant that the porting exercise was much quicker.
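To illustrate what processing without a frame buffer means, here is a minimal software sketch of a streaming feature extractor. It is not Conekt's algorithm: the simple horizontal-gradient test, the function name `stream_features` and its `threshold` and `span` parameters are all assumptions chosen for illustration. Pixels are consumed one at a time in raster order, and only a few pixels of the current row are ever held, much as a short shift register would hold them in hardware.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def stream_features(
    pixels: Iterable[int], width: int, threshold: int, span: int = 2
) -> Iterator[Tuple[int, int, int]]:
    """Yield (row, col, gradient) wherever intensity steps sharply along a row.

    Only the last span + 1 pixels of the current row are stored, so the
    memory needed does not grow with the image size.
    """
    window = deque(maxlen=span + 1)  # stands in for a short shift register
    for index, value in enumerate(pixels):
        row, col = divmod(index, width)
        if col == 0:
            window.clear()  # never compare pixels across a row boundary
        window.append(value)
        if len(window) == span + 1:
            gradient = window[-1] - window[0]
            if abs(gradient) >= threshold:
                yield row, col, gradient


# Two rows of six 10-bit pixels: a dark-to-bright step in row 0, a flat row 1.
frame = [100, 100, 100, 900, 900, 900,
         100, 100, 100, 100, 100, 100]
features = list(stream_features(frame, width=6, threshold=500))
print(features)
```

The output is a short list of features rather than a processed image, which mirrors the idea of passing only a focused set of features to the host microcontroller.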
Algorithms for operation in non-urban scenarios are in development
It is interesting to compare the Conekt approach with a ‘traditional image processing building blocks’ approach.
A traditional system has been prototyped consisting of an edge-detector/thresholding/dilation operator, targeting an 8-bit image. The techniques used were not designed specifically for lane-marking detection purposes. This results in an image that requires further post-processing before being suitable for lane-detection. This solution uses around 1800 LUT/FFs and 11 RAM blocks. It can process around 15 frames per second (fps).
The approach taken by Conekt was an integrated one, designed for the 10-bit image, and producing a limited and focussed set of features of interest - minimising the resources needed within the FPGA, which required only around 1000 LUT/FFs, and no RAM blocks. It can potentially process 250 frames per second, which means that the pixel-processing engine is no longer the processing bottleneck, as the DSP would have been. The post-processing required by the host microcontroller is also much smaller in this approach.
On top of this utilisation, there remain the interfaces to camera and host-processor and diagnostics, which could be expected to be similar between the two designs, at around 1250 LUT/FFs, and 7 RAM blocks.
The total RAM usage, and the LUT count, of the traditional design would mean a larger device would be required compared to TRW's approach. It is clear that with carefully designed algorithms, smaller and lower cost FPGAs can be used.
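The comparison can be made concrete by adding the shared interface logic to each pixel-processing design. The sketch below uses only the approximate figures quoted above; the `Resources` class and the variable names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    lut_ff: int      # approximate LUT/FF count
    ram_blocks: int  # block RAMs used

    def __add__(self, other: "Resources") -> "Resources":
        return Resources(self.lut_ff + other.lut_ff,
                         self.ram_blocks + other.ram_blocks)


# Approximate figures quoted in the article.
TRADITIONAL = Resources(lut_ff=1800, ram_blocks=11)
INTEGRATED = Resources(lut_ff=1000, ram_blocks=0)
INTERFACES = Resources(lut_ff=1250, ram_blocks=7)  # camera, host and diagnostics

traditional_total = TRADITIONAL + INTERFACES
integrated_total = INTEGRATED + INTERFACES
print("traditional:", traditional_total)
print("integrated: ", integrated_total)
```

On these numbers the traditional design needs around 3050 LUT/FFs and 18 RAM blocks in total, against around 2250 LUT/FFs and 7 RAM blocks for the integrated approach.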
Neither the FPGA nor the DSP is suitable for running traditional portable C-code directly for image-processing tasks. Specialist engineers are required to make the most of these devices. In the FPGA, the complete code currently requires around 50% of the resources of the device, which leaves plenty of room for expansion. In addition, it is capable of processing the image at up to 250 frames per second (fps). The DSP would be very highly utilised, even at the minimum acceptable frame rate of 30 fps.
The FPGA (despite preconceptions) turned out to be lower cost than the DSP. Both devices require a host microcontroller to perform the rest of the processing, and the DSP is likely to need an external SRAM device for code and data storage, increasing both cost and size. As the FPGA operates in parallel, it only needs to be clocked at the camera pixel clock rate (which is of the order of 25 MHz). A DSP would need a clock of several hundred MHz, increasing the EMI problems of the design. Lower clock rates also tend to produce lower power consumption.
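The clock-rate argument can be illustrated with some simple arithmetic. The sketch below assumes, purely for illustration, that the 400 to 800 MOPS estimate quoted earlier applies at a 25 MHz pixel rate, and that a DSP sustains a fixed number of operations per clock cycle. Neither assumption comes from the article, and real DSP throughput depends heavily on the architecture.

```python
def ops_per_pixel(mops: float, pixel_clock_mhz: float) -> float:
    """Operations needed per pixel to sustain `mops` at the given pixel rate."""
    return mops / pixel_clock_mhz


def dsp_clock_mhz(mops: float, ops_per_cycle: float = 1.0) -> float:
    """Clock a sequential processor would need at a sustained ops-per-cycle rate."""
    return mops / ops_per_cycle


PIXEL_CLOCK_MHZ = 25.0  # order of magnitude given in the article

for mops in (400.0, 800.0):
    print(
        f"{mops:.0f} MOPS: {ops_per_pixel(mops, PIXEL_CLOCK_MHZ):.0f} ops/pixel, "
        f"DSP clock {dsp_clock_mhz(mops, ops_per_cycle=2.0):.0f} MHz at 2 ops/cycle"
    )
```

A pipelined FPGA design spreads those per-pixel operations across parallel hardware stages, each handling one pixel per clock, which is why its clock can stay at the pixel rate while a sequential processor must run many times faster.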
A further benefit of the FPGA design is that a diagnostics and debugging output can be generated for no incremental cost. This allows development engineers to log the image data from trials ‘on-the-road’, along with the results of the FPGA processing and further processing by the microcontroller, all of which are synchronised. This would be impossible with a DSP-based system.
Due to its parallelism, the FPGA is completely deterministic in its behaviour, so no matter how complex the scene, it will always transmit its data to the host microcontroller starting at the same time and taking up the same amount of time. A DSP-based system will vary in its processing time depending on the complexity of the scene.
Both FPGAs and DSPs can interface directly to the parallel output of the imager. However, the FPGA can implement a serial LVDS interface directly, reducing the pin count between the imager and the FPGA. This matters if the imager is to be mounted on a standoff board, or if a flexible PCB is used, because fewer interconnections improve reliability. Finally, the FPGA offers the future possibility of integrating the host microcontroller to produce a true system-on-a-chip solution.
For every problem, there are many solutions. The use of FPGA processing power in this Lane Departure Warning application has enabled Conekt to develop a high-integrity, low-cost automotive image processing system for Driver Assistance applications. This has been achieved through depth of knowledge in FPGA design and innovative implementation of the algorithms in an FPGA architecture.
For more information about our FPGA design services, please contact us.