Outreach: high school students.

I visited a girls’ school last month for their careers fair to talk about PhDs, NICTA and computing. If even one student thinks about enrolling in CS, it was worth it. This is the talk I gave to them.

My PhD is in computer science. Day to day I read about cool things that other people have done in my field, and design computer programs. In my lab we are working on unhackable helicopters that can fly themselves. These have all sorts of applications – from military and civilian surveillance to exploring disaster zones, where they can search for people and check for hazards before real people enter the scene.


But I mainly wanted to talk to you about how I got to where I am today.


For my HSC I took 4-unit maths, physics, English, IPT and visual art. I loved all of my subjects except IPT. Guess what my best mark was in – it definitely wasn’t physics. It was art. I did pretty well in my other subjects too – but I put the most effort into my major work.


I took a gap year and went to Sweden on a Rotary exchange, which was really fun. You can do exchanges while at university too, and I recommend you look into it – travel only gets harder as you get older and have more things tying you down.


I got back from Sweden and enrolled in a telecommunications engineering degree. I wanted to do graphic design, but my maths teacher said I was too good at maths, and that engineering was creative too.


But in my first year of uni I discovered two things. The first was programming. I had never, ever written a program before, but we had to take a computer science course for first-year engineering. I loved it. I looked forward to my programming courses, did my homework and assignments early, and even practised writing programs when I didn’t have any assignments. Although it doesn’t involve drawing, it’s like creatively putting a puzzle together. At the end you come out with some software that you made. The result was that I changed from telecommunications engineering to a combined degree in telecommunications and computer science – so I could do more programming.


The second thing I learned was that I hated physics. While physics was fun in high school, it was different at uni and I just didn’t have fun doing it anymore. I also missed writing and reading, which you don’t do much of in maths and programming classes.


So I changed my degree again – I swapped out the telecommunications for an arts degree. I started off taking English classes, but eventually fell in love with philosophy and got a major in that.


So that’s me. Two degree changes and I finally ended up with a combined degree in computer science and arts, with a major in philosophy.

People were naturally confused. What does philosophy have to do with computers? Why are you taking an extra year at uni just to do arts — aren’t arts degrees pointless?


The answer is no. Arts degrees aren’t pointless. If you love to write and argue, an arts degree perfects your writing and arguing. You read the amazing writing of all the smart people who came before you, and learn to criticise it and tell them they were wrong.


And guess what? Being a scientist or engineer who can write and argue well only makes you a far better scientist and engineer. When I did my honours thesis, I got excellent marks not just because I did good science, but because I knew how to write about it well – something that many engineering students never learn to do.


So here are my points: picking a degree at university isn’t final, it’s just a first step. Do what you love. You might find that changes. Follow what you love to do. A career is something you will do for 40 hours a week for 40 years. It had better be something that can get you out of bed every morning.


I also did a lot of internships during my degree. I worked at Downer Engineering in first year. At the end of second year I applied for a job at the software company Atlassian and worked there for two years – full-time over summer and part-time during semester.


One summer late in my degree I went on an internship with Microsoft.


I got to go to Seattle all winter, and get paid a lot. So much that I could afford to go skiing in Canada three times during that winter, then go to New York for a week at the end of my internship. Working at Microsoft was fun and it paid really well. At the end of the internship they offered me a job and a ridiculous pay packet.


I turned them down. Although Microsoft was fun, I didn’t feel like I was contributing much to society by writing software for people to buy to make Microsoft money.


And that’s why I’m doing a PhD. I want to invent something new, and push at the boundaries of human knowledge. It will be a very small push. But I think it’s worth it, and it makes me excited to do my work every day.


The last point I want to make is that with computer science you can do anything. You can start out having no idea what to do and then go work at a bank and make lots of money, work at Google on whatever you want, do a PhD, write websites, program video games or start your own business. Software is everywhere.


I’d love to be able to reach out to more schools, and not just all-girls schools in the centre of the city. Nationally we have a huge shortage in computing enrolments, and a very small proportion of those enrolments are female.

But I have a PhD to work on. If everyone takes on a small role in reaching out to school students, we can advertise our industry and smash the stereotypes that make computing less attractive.


What is real-time?

A note on correctness

Determining correctness in computing can be difficult, even when it appears trivial.

Consider a calculator application on your PC, laptop, tablet or phone. The calculator has a lot of different functionality, but let’s take just one piece: the add function. How do we tell if it is correct?

Can we define it as “1 + 1 = 2, 1 + 2 = 3, 1 + 3 = 4…”? Clearly, that approach is never going to end. Instead we state it like mathematicians, and say “for all possible a’s and all possible b’s, the calculator’s answer for a + b equals the mathematical sum a + b.” But applying this correctness test to the add function on the calculator on my PC actually fails.

Why? Once the numbers get too big, the calculator cannot add them any more, and fails with an error like “Invalid input” or “Overflow”. This is because the calculator has a limit on the size of numbers it can use. So our definition of correctness for the add function becomes “for all possible a’s and all possible b’s, the calculator’s answer for a + b equals the mathematical sum a + b, as long as a + b is no bigger than the largest number the calculator can represent.”
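
To make that definition concrete, here is a minimal sketch in C of an add function with this limit built in. It assumes 32-bit signed integers; the function name and the demonstration values are invented for the example, not taken from any real calculator.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Checked addition: only "correct" for inputs whose sum fits in the
 * representable range, mirroring the calculator's "Overflow" error. */
static bool checked_add(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false; /* a + b falls outside the representable range */
    }
    *result = a + b;
    return true;
}

int main(void)
{
    int r;
    if (checked_add(2000000000, 2000000000, &r)) {
        printf("%d\n", r);
    } else {
        printf("Overflow\n"); /* this branch is taken: the sum doesn't fit */
    }
    return 0;
}
```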

Real-time correctness

Imagine a scenario where your computer is running slowly. Clicking on buttons doesn’t result in an immediate response. You have the calculator open and are using it to add numbers, but it takes a few minutes to show the result. The add function is still correct according to our definition earlier, but it would be faster to use pen and paper.

What if we modified the definition again to include time? We could easily add “… and finishes in under a second”. That would be overkill for a basic calculator application. But what if the calculator were being used to perform altitude adjustments to keep a plane in the air? Then determining correctness based on timing is essential.

This is exactly what real-time computing is – a system where correctness depends not only on its outputs, but also on the time at which they are produced.
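
As a rough sketch of what this extended test could look like in C: a result only counts as correct if the value is right and it arrived within the deadline. The one-second deadline and the calculator_add function are illustrative assumptions carried over from the example above, not a real calculator’s code.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

#define DEADLINE_NS 1000000000L /* one second, in nanoseconds */

/* Stand-in for the calculator's add function under test. */
static int calculator_add(int a, int b)
{
    return a + b;
}

static long elapsed_ns(struct timespec start, struct timespec end)
{
    return (end.tv_sec - start.tv_sec) * 1000000000L
         + (end.tv_nsec - start.tv_nsec);
}

int main(void)
{
    struct timespec start, end;
    int a = 2, b = 3;

    clock_gettime(CLOCK_MONOTONIC, &start);
    int result = calculator_add(a, b);
    clock_gettime(CLOCK_MONOTONIC, &end);

    bool value_correct = (result == a + b);
    bool deadline_met  = (elapsed_ns(start, end) <= DEADLINE_NS);

    /* Real-time correctness: the output AND its timing must both be right. */
    printf("correct: %s\n", (value_correct && deadline_met) ? "yes" : "no");
    return 0;
}
```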

Types of real-time

In real-time computing, we talk about systems with deadlines. In the calculator example, the deadline would be 1 second after the add button is pressed. There are many different classifications of real-time systems, which vary by the strictness of their timing constraints:

| Type of real-time | Timing constraint |
| --- | --- |
| Hard | If a deadline is missed, the system is incorrect. |
| Firm | If more than a predefined percentage of deadlines are missed, the system is incorrect. |
| Soft | Definitions vary, but generally: if the average lateness exceeds a defined threshold, the system is incorrect. |
| Best-effort | No deadlines, just as much time as can be allocated. Correctness does not depend on time. |

A basic calculator application is best-effort (although whether it should be is a different question). An example of a hard real-time system is the air-bag deployment software in a car.

If the air-bag isn’t triggered in time, the system is not only incorrect – there is a high risk of fatal injury. Soft real-time systems include media players, where quality will decrease if deadlines are not all met – but the system isn’t incorrect until too many deadlines are missed.
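
The “too many deadlines” idea is easy to express in code. Below is a minimal C sketch of the firm rule from the table: the system counts its missed deadlines and is flagged incorrect once more than a predefined percentage have been missed. The 5% threshold and the function names are invented purely for illustration.

```c
#include <stdbool.h>

#define MAX_MISS_PERCENT 5 /* invented threshold: tolerate up to 5% misses */

static unsigned long jobs_run = 0;
static unsigned long deadlines_missed = 0;

/* Record one completed job and whether it missed its deadline. */
void record_job(bool missed_deadline)
{
    jobs_run++;
    if (missed_deadline) {
        deadlines_missed++;
    }
}

/* Firm real-time rule: the system is incorrect once more than
 * MAX_MISS_PERCENT of deadlines have been missed. */
bool system_still_correct(void)
{
    if (jobs_run == 0) {
        return true; /* no jobs yet, nothing missed */
    }
    return (deadlines_missed * 100) / jobs_run <= MAX_MISS_PERCENT;
}
```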

Enter mixed-criticality

Intuitively, the strictness of the real-time system appears to line up with the criticality of the system. Air-bag systems are obviously at a catastrophic criticality level, as fatal injury is likely on failure. A failing media system will annoy users, and fits into the minor criticality classification. However, the relationship between criticality and real-time strictness is not that clear.

Mixed-criticality systems also mix types of real-time. A recent paper [1] looks at autonomous helicopters, which have (at least – this is simplified) these different tasks:

| Task | Description | Real-time model | Criticality |
| --- | --- | --- | --- |
| Flying | Responding to sensors and keeping the helicopter in the air and balanced. | Hard | Catastrophic |
| Navigation | Tracking objects to avoid obstacles or follow targets. | Soft | Major |
| Reporting | Streaming video back to base. | Soft | Minor |

The interesting feature of this system is the navigation task. This task takes longer to execute the more objects it has to track. It’s still fairly critical — the system is incorrect if the helicopter navigates into a wall. Yet it has to be treated as soft real-time, as there is no way to know how much time this task needs. This is exactly what makes mixed-criticality systems interesting. Also, it’s a helicopter :).

[1] D. de Niz et al., “On Resource Overbooking in an Unmanned Aerial Vehicle”, IEEE/ACM International Conference on Cyber-Physical Systems (ICCPS), 2012.

Mixed-Criticality and WCET: a tale of hardware utilisation

In the last post I talked about why mixed-criticality systems — systems where software applications of different criticalities run on the same physical hardware — are important. In this post, I’ll talk about why this can lead to greater use of the hardware overall.

This sounds counterintuitive. All mixed-criticality means is taking a few small pieces of hardware with separate, critical applications running on them, and running those applications on the one piece of hardware. The end result is the same use with less hardware. Right?

The answer is yes. But we can do better.

Mixed-criticality in planes

Some commercial aircraft use mixed-criticality systems. The flight control system and the autopilot system share the same hardware. Clearly, the flight control system is much more critical than the autopilot. A failure in the autopilot would be bad, but the pilot can take control. But if a failure occurred in the flight control system — well, hopefully everyone wouldn’t die.

Current practice is to work out how much time each application needs to do its job, and hand out portions of processing time to each application. We can determine that the autopilot needs 40% of the total processing time, and the flight control requires the other 60%. How do you share the hardware in this way? Just define a fixed amount of time, say 100 milliseconds (ms). You execute the flight control software for 60ms, then the autopilot for 40ms, then the flight control software again…
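
Here is a very simplified sketch in C of that fixed time-slicing (a cyclic executive). The two task functions are invented stubs, and in a real system a hardware timer would pre-empt each application when its slice expires rather than trusting it to return on time.

```c
#include <stdio.h>

/* Invented stand-ins for the real applications. */
static void do_flight_control(void) { puts("flight control slice"); }
static void do_autopilot(void)      { puts("autopilot slice"); }

int main(void)
{
    /* Each iteration is one 100ms frame: 60ms of flight control,
     * then 40ms of autopilot, then the frame repeats. */
    for (int frame = 0; frame < 3; frame++) {
        do_flight_control(); /* would run within its 60ms budget */
        do_autopilot();      /* would run within its 40ms budget */
    }
    return 0;
}
```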

Worst-case execution time

But how do you calculate that one piece of software needs 40% of processing time to operate correctly?

The answer is to do worst-case execution time (WCET) analysis. The point is to work out the longest time that a piece of software could take to execute. A basic way to do this is to add up how long all of the instructions in a program can take. Of all the paths that the software could execute, always take the longest one.
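
A toy version of this path-based idea might look like the C sketch below: every block of code is given a worst-case cost, and at each branch we always take the more expensive side. Real WCET tools analyse a full control-flow graph and model the hardware; the structure and the numbers here are made up purely to illustrate the “take the longest path” rule.

```c
#include <stdio.h>

/* A block of straight-line code with a worst-case cost (e.g. a bound
 * on its instruction timings) and up to two successors (a branch). */
struct block {
    unsigned cost;
    const struct block *if_taken;
    const struct block *if_not_taken;
};

/* WCET of the program starting at block b: the block's own cost plus
 * the cost of whichever successor path is longer. */
static unsigned wcet(const struct block *b)
{
    if (b == NULL) {
        return 0;
    }
    unsigned taken     = wcet(b->if_taken);
    unsigned not_taken = wcet(b->if_not_taken);
    return b->cost + (taken > not_taken ? taken : not_taken);
}

int main(void)
{
    /* A tiny if/else: entry -> (cheap branch | expensive branch) -> exit. */
    struct block exit_blk  = { 2,  NULL,      NULL };
    struct block cheap     = { 10, &exit_blk, NULL };
    struct block expensive = { 50, &exit_blk, NULL };
    struct block entry     = { 5,  &cheap,    &expensive };

    printf("WCET estimate: %u cycles\n", wcet(&entry)); /* 5 + 50 + 2 = 57 */
    return 0;
}
```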

However, due to the complexity of modern hardware, the time a single instruction takes to execute is not consistent. But we can’t risk not having enough time for our critical software to run. This means we have very pessimistic WCET estimates – which may be several orders of magnitude greater than a worst case execution time obtained by measurement.

Obtaining precise WCET measurements is an ongoing field of research.

Back to the flight control example

The flight control system requires 60% of total processing time based on the computed WCET. The same applies for the autopilot. But in reality, neither application uses this much time. They might, but the probability is incredibly low. We just can’t calculate a better estimate.

Recall that WCET estimates can be orders of magnitude worse than typical behaviour. The table below shows how big this gap could be for our flight control software:

| WCET estimate | 1 order of magnitude less | 2 orders of magnitude less |
| --- | --- | --- |
| 60ms | 6ms | 0.6ms |
| 40ms | 4ms | 0.4ms |

Given that WCET estimates are conservative and pessimistic, the actual time used on average by the flight control software could be just 6ms, or even 0.6ms (two orders of magnitude less). If the autopilot is only using 4ms, then together both applications use just 10% of the available processing time.

Even though this is already a mixed-criticality system, we could be spending 90% of our time with idle hardware.

What could we do with that other 90% of time? Add more software applications – with lower criticalities. Applications where a failure isn’t serious or noticeable, so that if the flight control software ever does need its entire WCET, it can still have it. And that is exactly how mixed-criticality systems can improve the overall usage of the hardware we have.

The rise of mixed-criticality systems

Criticality

Errors occur in software on a daily basis. Not just on our laptops and desktop computers, but in embedded devices everywhere. An app on your phone crashes. Visible errors appear on a screen at a railway station (often BSODs).

[Image: an error screen on a CityRail railway station display]

While these errors are annoying, they don’t have a high impact on our lives. Safety-critical systems are the complete opposite. In a safety-critical system, an error risks lives or significant damage to the environment. In aviation, the DO-178B standard has five criticality levels for ranking failures, which classify the impact on safety (the plane), the crew and the passengers:

  • Catastrophic: failure may cause the plane to crash.
  • Hazardous: failure greatly reduces safety, causes potentially fatal injuries to passengers, or puts crew at risk of not operating the aircraft properly.
  • Major: failure has an impact on safety, causes potential (non-fatal) injuries to passengers or increases crew workload.
  • Minor: failure is noticeable and causes passenger inconvenience (but not injury).
  • No effect: failure has no impact on safety, crew or passengers.

Software systems in the plane are then classified according to their criticality. The higher the criticality (catastrophic is the highest), the more strict the development processes that are required before the software can be deployed. High criticality processes require certification by independent bodies.

Applications with different criticalities – especially the higher rankings – were traditionally physically isolated: different hardware and cables for systems of different criticalities. This was for cost and safety reasons. The higher the criticality of a piece of software, the more time and effort is required before it can be used. You don’t want to bundle functionality that is rated minor with functionality that is rated catastrophic, or you will have to apply the certification process for the catastrophic rating to the entire application. This quickly gets expensive.

The cost reason then facilitates the safety reason. If you don’t want to certify a minor criticality functionality to the level of a catastrophic one, it must be isolated from the catastrophic one. Otherwise the minor application could have errors and bugs that cause the catastrophic one to fail.

All of this makes sense. But it’s wrong. Physical hardware isolation is no longer practical. Enter mixed-criticality systems.

Mixed-criticality systems

A mixed-criticality system is one that runs applications of different criticalities on the same physical hardware. The physical isolation of yesterday is lifted to the operating system – a very small, privileged, trusted piece of code that guarantees that applications cannot cause each other to fail. By certifying the operating system, the obligation to do rigorous certification on low-criticality applications disappears.

There are several motivations for building mixed-criticality systems. The first is all about weight. Three factors have led to a massive growth in the number of processors in cars and planes:

  • Software is flexible: a standard computer can do anything. To change what it does, you change the software. If the software is connected to a network, this can be done remotely. Unlike custom hardware and circuits, which need to be called back to the manufacturer for physical changes.
  • Hardware breaks: hardware failures, even in specialised hardware, are a possibility. As a result, safety-critical systems have back-up processors just in case.
  • Software is useful: we do more with software than ever before. Just compare a smart phone with one of those 90’s bricks. List all of the functionality and count how much more the smartphone does.

As a result, safety-critical systems have a huge number of processors. Cars have nearly 100. Isolation only makes this worse. Isolating tasks of different criticality means duplicating hardware and all of the cables that go with it. For a plane, the sheer weight is a huge cost in terms of fuel.

The “internet of things” includes your pacemaker

Another factor driving the demand for mixed-criticality systems is the internet – specifically, the internet of things. Consider the hypothetical development of a pacemaker (because I don’t know much about real ones). Early pacemakers were probably hard-coded circuits. They couldn’t be updated without traumatic surgery to take them out and put new ones in. Instead of taking the risk, old pacemakers are left in.

Time moves on. The pacemaker is now just a chip with some special hardware to regulate the heart. Why can’t we put a web server on there? It would allow doctors to download reports of your heart rate over time. A pacemaker connected to the internet could allow you to look at those reports and monitor your own health. It could even contact your doctor for you if something goes wrong.

This is a mixed-criticality system. A web server is complex software that would cost way too much to certify. The web server must not be able to interfere with the pacemaker’s functionality, either through its own errors (which could kill you) or by letting hackers control your heart (they could kill you too). The system requires isolation, with controlled, one-way information flow between the web server and the pacemaker.

Now think of all the other critical devices that could be part of the internet of things.

Image credit goes to hellofish