Monday, September 9, 2013

#27 - Towards a Ph.D. Thesis Proposal: NextGen Project and Optimization Strategy

With about 7 months remaining until a target graduation date, the first month has to entail projection and proposal.  The next three months, from here until the end of the year, entail the core work and research on that proposal.  The final four months focus on developing the dissertation that culminates that core work, as well as attributing work done previously.  When I'm at a loss for ideas, I find that writing them out sometimes helps them come forward.  It's like that in creative writing: don't think about what to write, just let the words come out.  Creativity and ideology live in the brain; writing can be an effective way to channel them outward.

"What are my birds?" is asked by my advisor.  I have an algorithm called GALE, which can perform multi-objective optimization in very few evaluations.  This sounds really cool, but this is purely at this point, an algorithmic state of affairs.  Who cares?  What about the application of such an algorithm and its attributes to bettering the world?  This is the algorithms vs applications conflict.  Many theses and topics of research live purely in the world of algorithms, but lately a push for application-world research is called for.

GALE can optimize stuff, and learn stuff, in few evaluations of the model.  If I want a thesis out of this, I need to find a way to tie the importance of this into the applications world.  Why would doing things faster be a good thing?  The only case I can think of is safety-critical devices, where speed is a requirement.  Secondly, what does it mean to learn stuff in few evaluations, blah, blah?  I really need to jump away from algorithmic-speak.

What The Birds Are


Overall, I have about four things in general that can be called my birds here.  First and foremost I have an algorithm called GALE.  This algorithm is a tool for software engineering researchers and intelligent systems design.  The key here is that GALE can be used to aid in the development of systems that can analyze their environment and make expert decisions very quickly.  The obvious bit here is that it might be highly critical for the system to make those decisions quickly, e.g. consider systems where decisions affect the safety of human lives.  Furthermore, if the system cannot make those decisions quickly enough, say, to react to very unexpected and sudden environment changes, then the safety of lives might also be endangered.

In the algorithmic world that GALE lives in, it needs a simulation to study as its environment.  In the application world, this simulation becomes the environment through machine learning.  This transition is a long way off, but the connection between optimization studies (like with GALE) and machine learning (using optimization to learn) is slowly becoming stronger every day.  It might be sensible to believe that one day in the future these two fields will merge and become one.

For now, we use GALE with a simulation in the algorithm world.  The first study on GALE was performed on a simulation called POM3, in which the process of completing a software project is modeled through requirements engineering (i.e. how best to plan a strategy for completing the project's tasks).  POM3 as a simulation has a handful of decisions that can be made, as well as a set of objectives to optimize in decision making.  Remember that anytime a decision is made, the critical-thinking process aims to optimize some goal (e.g. what type of car should I buy?  We have goals of minimizing cost, maximizing MPG, maximizing aesthetic appeal, etc.).  For POM3, these goals were Completion (percentage of tasks that were completed, because a lot of projects only get so far before termination), Cost (total money spent), and Idle Rate (how often developers were waiting around on other teams).  The decisions were things like team size, size of the project, and other decisions more specific to software engineering.
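
To make the idea of competing goals concrete, here is a minimal sketch of Pareto dominance over the three POM3 objectives named above.  The candidate values and the helper names (`dominates`, `pareto_front`) are invented for illustration.  This is the generic textbook comparison, not a description of how GALE itself ranks candidates.

```python
# Illustrative only: the objective values below are invented, not POM3 output.
# Each candidate is a tuple (completion, cost, idle_rate).
# Completion is maximized; cost and idle rate are minimized.

def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one."""
    # Flip the sign of completion so every objective reads "smaller is better".
    a_min = (-a[0], a[1], a[2])
    b_min = (-b[0], b[1], b[2])
    no_worse = all(x <= y for x, y in zip(a_min, b_min))
    strictly_better = any(x < y for x, y in zip(a_min, b_min))
    return no_worse and strictly_better

def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

candidates = [
    (0.90, 1200.0, 0.30),  # finishes more, costs more
    (0.80, 1000.0, 0.20),  # finishes less, costs less
    (0.70, 1300.0, 0.40),  # worse than the first on all three objectives
]
front = pareto_front(candidates)
```

The first two candidates trade completion against cost, so neither beats the other and both stay on the front; the third is dropped.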

Going back to developing a thesis proposal: POM3 doesn't really fit our needs.  We'd like a simulation that can be optimized with GALE but has some application-world use where the power of making decisions quickly is very important.  POM3 isn't really safety-critical at all.  While it was a good and meaty simulation with many decisions and objectives, it just won't cut it for a proposal in a thesis that wants to live in the application world.

The NASA Birds


My last two birds are two projects from my work out in California with NASA Ames Research Center.  These two projects involve aerospace research, and while one is a simulation, the other is a tool for ensuring safety of flight.  So it sounds like right away, we might have tools on hand for a thesis if we can combine the two projects.  WMC (Work Models that Compute) is the simulation, while TTSAFE (Terminal Tactical Safety Assurance Flight Evaluation) is the tool for conflict detection of aircraft - to make sure they don't collide in airspace.  Sounds trivial, but there are problems that need to be kept contained.

TTSAFE merely examines an input file containing codes that deal with aircraft locations, flight plans, tracking data, velocities, altitudes, and more.  This input file feeds into TTSAFE and a conflict detection algorithm determines if there are multiple aircraft heading towards each other on a collision course.  There are three main parameters here for such an algorithm.  The first is how often TTSAFE checks airspace for conflicts (granularity).  Secondly, to identify conflicts, a line can basically be drawn starting from every aircraft in airspace, and extending along the path of its velocity and direction.  If any two lines intersect, then there is a conflict between the two aircraft of those two lines.  The second parameter here is how long to draw the lines, e.g. 3 nautical miles, 10 nautical miles (or perhaps measured in time).  Thirdly, lines may not need to intersect to conflict, but instead merely come close to each other.  So the third parameter is the safe radius around the aircraft.
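
The line-drawing idea can be sketched as a closest-approach check between two straight-line tracks.  This is a simplified 2D illustration under my own assumptions (flat airspace, no altitude, constant velocity, look-ahead measured in time), not TTSAFE's actual algorithm.  The function names and the units (nautical miles, minutes) are hypothetical.  The first parameter, granularity, is not modeled; it would be how often an outer loop calls this check.

```python
import math

def min_separation(p1, v1, p2, v2, look_ahead):
    """Smallest distance between two straight-line tracks over [0, look_ahead].

    p1, p2 are (x, y) positions; v1, v2 are (vx, vy) velocities.
    """
    # Position and velocity of aircraft 2 relative to aircraft 1.
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    vx, vy = v2[0] - v1[0], v2[1] - v1[1]
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        t = 0.0  # identical velocities: the separation never changes
    else:
        # Time of closest approach, clamped to the look-ahead window
        # (the "how long to draw the lines" parameter).
        t = -(rx * vx + ry * vy) / speed_sq
        t = max(0.0, min(look_ahead, t))
    return math.hypot(rx + vx * t, ry + vy * t)

def in_conflict(p1, v1, p2, v2, look_ahead, safe_radius):
    """Flag a conflict when the tracks pass within safe_radius of each other."""
    return min_separation(p1, v1, p2, v2, look_ahead) < safe_radius

# Two aircraft 20 nm apart, flying head-on at 4 nm/min each.
short_view = in_conflict((0, 0), (4, 0), (20, 0), (-4, 0), look_ahead=2.0, safe_radius=3.0)
long_view = in_conflict((0, 0), (4, 0), (20, 0), (-4, 0), look_ahead=3.0, safe_radius=3.0)
```

With a 2-minute look-ahead the pair is still 4 nm apart at the end of the window, so nothing is flagged.  With a 3-minute look-ahead the same pair is flagged.  That is the trade-off behind the second parameter: longer lines catch more real conflicts, and presumably more false ones too.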

Once TTSAFE identifies conflicts, it proposes resolutions to those conflicts, and flight plans are adjusted based on rules of right-of-way in airspace.  There are problems with such a conflict detection algorithm because not all identified conflicts are real conflicts.  For instance, some aircraft may be in the middle of making a turn, so while the current velocity and direction extend a line that intersects that of another aircraft, there would never have been a conflict: the turn simply "tricked" TTSAFE into believing there'd be one.  Nevertheless, such a False Alarm is taken seriously and the aircraft is signaled to adjust its flight plan - lengthening its miles traveled; costs more; takes longer to fly.  Overall, these are things we want to optimize, but false alarm rate is a major conflicting objective.

WMC is a project out of Georgia Tech which deals with computing trajectories for aircraft on approach to runways to land.  The computations rely on physics and aircraft type, and introduce cognitive measures that model the manner in which pilots take action in approaching the runway.  WMC is a simulation.  Taking only one aircraft at a time along with its flight plan, starting point, starting velocity and starting altitude, it simulates the landing of that aircraft, yielding tracking data throughout to its completion.

After WMC simulates the landing of an aircraft, it can add its tracking data into TTSAFE along with the rest of airspace.  TTSAFE then determines if such a landing approach is safe, and if not, then resolutions are given and WMC re-computes the landing approach, and so on.  Since WMC only deals with landing, we need to realize that our "birds" here only deal with local airspace, say, within 50 miles of an airport.  So there is no need to consider cross-country flights from takeoff to landing; instead we would be, in a sense, "spawning" aircraft randomly inside the 50 mile radius around an airport.  To further stress-test the system and emphasize the power of GALE, extreme circumstances can be invented in airspace that otherwise might not naturally occur.  For example, what if an aircraft is hijacked and begins ignoring commands from ground control?  Or more calmly, what happens if some aircraft in general ignores a command and doesn't adjust its flight plan?
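
A minimal sketch of what "spawning" could look like, assuming a flat disk centred on the airport.  The record fields (`id`, `x`, `y`, `heading_deg`) are hypothetical placeholders.  Real input would also need altitude, speed, aircraft type and a flight plan in whatever format WMC and TTSAFE expect.

```python
import math
import random

def spawn_aircraft(n, radius_miles=50.0, seed=None):
    """Place n aircraft uniformly at random inside a disk around the airport at (0, 0)."""
    rng = random.Random(seed)  # a seed makes a generated airspace reproducible
    aircraft = []
    for i in range(n):
        # Taking the square root keeps the density uniform over the disk's area;
        # without it, aircraft would cluster near the airport.
        r = radius_miles * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        aircraft.append({
            "id": f"AC{i:03d}",
            "x": r * math.cos(theta),
            "y": r * math.sin(theta),
            "heading_deg": rng.uniform(0.0, 360.0),
        })
    return aircraft

airspace = spawn_aircraft(20, seed=42)
```

Fixing the seed matters later on: different optimizers should be compared on the same generated airspaces, not on different random draws.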

How to Fly The Birds


TTSAFE and WMC on their own are fully developed.  The interaction between the two is not.  I'm wondering if such a task could be completed in a few short months.  TTSAFE is coded in Java, and WMC is coded in C++0x.  I can run them each on their own.  Furthermore, GALE is coded in Python.  Unless a clever bash script can tie everything together, I could be stuck figuring out how to fly the birds.  The main problem is figuring out how to pass data between all three.  File I/O might be a bad idea, but it is probably the only workable one.  Going forward, I suppose the best thing is to take it one step at a time.  After all, four years ago I never thought I'd be able to be here because I looked at it as one giant step.  Instead, the many small steps are what got me here - and they are also the way forward.
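
One way the file I/O glue might look from the Python side is sketched below.  Everything about the interface is an assumption: the JSON format, the idea that a tool takes an input path and an output path as its last two arguments, and the `run_tool` helper itself.  The real command lines and file formats of TTSAFE and WMC would replace them.  A tiny Python one-liner stands in for the external tool so the sketch runs on its own.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path

def run_tool(command, payload):
    """Write payload to a temp file, run an external command on it, read back its output."""
    with tempfile.TemporaryDirectory() as workdir:
        in_path = Path(workdir) / "input.json"
        out_path = Path(workdir) / "output.json"
        in_path.write_text(json.dumps(payload))
        # Assumption: the tool accepts <input file> <output file> as its last two arguments.
        subprocess.run(command + [str(in_path), str(out_path)], check=True)
        return json.loads(out_path.read_text())

# Stand-in for a real tool: copies the input to the output and adds one field.
STAND_IN = [
    sys.executable, "-c",
    "import json, sys, pathlib; "
    "d = json.loads(pathlib.Path(sys.argv[1]).read_text()); "
    "d['seen'] = True; "
    "pathlib.Path(sys.argv[2]).write_text(json.dumps(d))",
]

result = run_tool(STAND_IN, {"aircraft": "AC001"})
```

Swapping `STAND_IN` for a Java or C++ command line would leave the rest unchanged, which is the appeal of file I/O despite its cost.  `check=True` makes a crashed tool raise an error instead of silently returning stale data.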

Step 1) WMC has a problem with simulating only an aircraft of a predefined type.  It needs to be adjusted so that it simulates for an aircraft of an input type.  Thus, it should be able to read its parameter information from the BADA database for that aircraft type.

Step 2) Develop a script which can run WMC a bunch of times for random aircraft types, along with randomized parameters of decision nature (such as those for the cognitive models).

Step 3) Try to tie GALE with this WMC script from step 2.  If we can do this, then any roadblock with further connecting TTSAFE with everything should be understood and made simpler.  Then, although the practicality of optimizing WMC may not be understood, at the very least we can see how GALE runs with WMC (and how similar algorithms, i.e. NSGAII or SPEA2, run with it).  Note that I'm most worried about the "if we can do this" part.

Step 4) Make a script which generates an airspace of aircraft with a variety of flight plans within a small radius of an airport.  This will be an input to the overall system that we ultimately aim to have.

Step 5) Feed the airspace of step 4 into WMC to generate accurate trajectory data for all aircraft.  This means we adjust the script of step 2 so that, instead of generating random aircraft, it takes input from the airspace of step 4.  As for cognitive decision parameters, we use what we learned from step 3.

Step 6) We need a script now that feeds data from WMC back into TTSAFE.  Basically, we just need a way to adjust the tracking data in the airspace input file (made initially in step 4).

Step 7) Lastly, a script that combines all of these into a loop that runs until all aircraft land.  Then we compute statistics and metrics for the process - stuff we want to optimize.

Step 8) Steps 4-7 become our ultimate model.  Whatever we call this, we then feed it through GALE vs NSGAII vs SPEA2 to optimize things and learn things, blah, blah.

Step 9) Adjustments, tweaks, fixes.  Looking for and getting sane data from step 8.

Step 10) If we have sane data, then can we publish it?  i.e. can we go forward and put it all into a thesis proposal?
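
The loop of step 7 can be sketched with the tools passed in as plain functions, so the control flow can be tested before WMC and TTSAFE are actually wired in.  The structure is my own guess at how the pieces might fit: `run_until_landed`, the stats it collects and the toy stand-ins are all hypothetical.  In this toy a "flight plan" is just an arrival slot number.

```python
def run_until_landed(airspace, simulate, detect, resolve, max_rounds=100):
    """Simulate, check for conflicts, and re-plan until every aircraft has landed.

    airspace maps aircraft id -> flight plan.  simulate stands in for WMC,
    detect and resolve stand in for TTSAFE.
    """
    pending = dict(airspace)
    stats = {"rounds": 0, "resolutions": 0, "landed": []}
    while pending and stats["rounds"] < max_rounds:
        stats["rounds"] += 1
        tracks = {ac: simulate(plan) for ac, plan in pending.items()}
        conflicted = detect(tracks)  # set of aircraft ids flagged this round
        for ac in list(pending):
            if ac in conflicted:
                pending[ac] = resolve(pending[ac])  # adjusted plan, retried next round
                stats["resolutions"] += 1
            else:
                stats["landed"].append(ac)
                del pending[ac]
    return stats

# Toy stand-ins, not real models.
def toy_simulate(plan):
    return plan  # the "track" is just the arrival slot

def toy_detect(tracks):
    seen, conflicted = set(), set()
    for ac in sorted(tracks):
        if tracks[ac] in seen:
            conflicted.add(ac)  # a later aircraft sharing a slot must give way
        seen.add(tracks[ac])
    return conflicted

def toy_resolve(plan):
    return plan + 1  # push the aircraft to the next slot

stats = run_until_landed({"A": 1, "B": 1, "C": 2}, toy_simulate, toy_detect, toy_resolve)
```

In the toy run, B shares a slot with A, gets pushed back once, and lands in the second round.  The returned counts are the kind of "stuff we want to optimize" that step 8 would hand to GALE, NSGAII and SPEA2.  The `max_rounds` cap guards against a scenario such as the ignored-command case, where an aircraft would otherwise never resolve.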