Monday, December 9, 2013

On FastMap: A Technique for Dimensionality Reduction

FastMap is a technique for dimensionality reduction.  Consider a dataset consisting of n rows, where each row is a vector of p attributes; that is, the dimensionality of the dataset is p.  The motivations behind dimensionality reduction are many.  Consider, for instance, visualizing such a dataset.  Each row can be viewed as a point in p-dimensional space, but if p is larger than 3, the dataset becomes unwieldy to visualize.  Even then, visualization alone may not explain much.  Dimensionality reduction finds more practical use from a computational standpoint, since some analyses of the dataset run faster in fewer dimensions.

FastMap selects a sequence of k of the p dimensions (k <= p) and projects the points from the underlying true space (p-dimensional space) onto the axes of a k-dimensional space.  It was first introduced by [1] in 1995, and identified as an alternative to multidimensional scaling [4].  FastMap can also be viewed as a fast approximation to Principal Components Analysis (PCA) [2], a connection analyzed in [5].

PCA was introduced in 1901 by Pearson [3] as an analog of the principal axis theorem, and was later formalized by Hotelling [2].  PCA can be computed either by an eigenvalue decomposition of the covariance matrix of the data, or by a singular value decomposition of the (centered) dataset itself.  The result of PCA is essentially an ordered list of p orthogonal axes: the first principal component (the first axis in the list) has the largest variance along it (it accounts for most of the variance in the dataset), and each subsequent component is an orthogonal axis accounting for the most variance remaining after the previous components.
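Both routes can be sketched in a few lines.  Here is a minimal SVD-based PCA, a generic illustration assuming NumPy (not code from any of the cited papers):

```python
import numpy as np

def pca(X, k):
    """Project the rows of X onto the first k principal components."""
    Xc = X - X.mean(axis=0)                       # center each attribute
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                          # rows of Vt = principal axes

# toy data: variance along the x-axis dominates
X = np.array([[2.0, 0.0], [0.0, 1.0], [-2.0, 0.0], [0.0, -1.0]])
Y = pca(X, 1)   # the single kept coordinate is (up to sign) the x value
```

Note the sign of each component is arbitrary, which is why the toy example only pins down the coordinate up to sign.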

FastMap employs a procedure similar in spirit to PCA.  The theoretical perspective is that each component is defined by two "pivots": the two most extreme points along that axis (remember that components are axes).  Given the p-many components, each orthogonal to the previous, it becomes clear that these pivots are vertices of a convex hull surrounding the dataset [5].

The FastMap procedure is as follows.  Given a dataset S, and the Euclidean distances between each pair of members of S, we first pick an arbitrary member of S and look for the member furthest away from it.  This is the first of two pivots; call it East.  The second pivot, called West, is the member furthest away from East.  With these two pivots selected, the projected coordinate (onto the 1-dimensional line through the pivots) of each member of the dataset is computed as follows.

x = the member being projected
c = distance from East to West
a = distance from East to x
b = distance from x to West

projection(x) = (a^2 + c^2  - b^2) / (2c).

This projection is based on the law of cosines.  The process can be repeated k times to obtain k dimensions, each repetition appending one coordinate to the projections.  Each subsequent pass works implicitly on a residual space in which the distance already captured by earlier axes has been factored out, and each new pair of pivots adds vertices to the hull that surrounds the dataset in its true space.  In this manner, FastMap is very similar to the vertex-growing method of the simplex search technique, in which vertices are grown with the intent of closing the convex hull of the simplex around solutions in the search space.
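The whole procedure condenses into a short sketch.  This is a minimal, from-scratch rendition of the algorithm described above, working on a plain distance matrix with residual squared distances (the original paper also re-runs the pivot search a few times to find better pivots):

```python
def fastmap(D, k):
    """Project items with distance matrix D (n x n, symmetric) into k dims."""
    n = len(D)
    coords = [[0.0] * k for _ in range(n)]

    def d2(i, j, axis):
        # squared distance with the first `axis` coordinates factored out
        s = float(D[i][j]) ** 2
        for a in range(axis):
            s -= (coords[i][a] - coords[j][a]) ** 2
        return max(s, 0.0)

    for axis in range(k):
        # pivots: furthest from an arbitrary member, then furthest from that
        east = max(range(n), key=lambda i: d2(0, i, axis))
        west = max(range(n), key=lambda i: d2(east, i, axis))
        c2 = d2(east, west, axis)
        if c2 == 0.0:
            break                      # nothing left to separate
        c = c2 ** 0.5
        for i in range(n):
            # law-of-cosines projection: (a^2 + c^2 - b^2) / (2c)
            coords[i][axis] = (d2(east, i, axis) + c2 - d2(i, west, axis)) / (2 * c)
    return coords

# three collinear points at positions 0, 1, 3: one dimension recovers them
D = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
print(fastmap(D, 1))   # [[3.0], [2.0], [0.0]]
```

On the collinear example the East pivot lands at coordinate 0 and the West pivot at coordinate c, exactly as the formula dictates.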

The connection between FastMap and multi-objective optimization is as follows.  In the simplex search technique, the key idea is that only the exterior vertices of the simplex are evaluated.  Likewise, for FastMap to be used within a search technique, only the pivots of the convex hull surrounding the dataset need to be evaluated.

The theory is that if only the convex-hull pivots are evaluated, good search results can still be attained, but in very few evaluations.  This is the basis of the GALE algorithm for multi-objective optimization.  Evidence for this theory comes from results showing that GALE achieves results comparable to state-of-the-art algorithms such as NSGA-II and SPEA2.

For the reader unfamiliar with GALE: it stands for Geometric Active Learning Evolution.  Standard search tools such as NSGA-II and SPEA2 are well known and extensively referenced in the literature, but they are "blind" in the sense that their search relies on random perturbations of individuals in the population.  The hope is that eventually these random perturbations lead to better and better solutions, but only after many evaluations.  When evaluating an individual is complex and time-consuming, this becomes a huge problem.  The advantage of GALE is that it can achieve comparable performance while greatly reducing the number of model evaluations.  In results for standard modeling problems, this speedup factor was between 20 and 89 (i.e., roughly 20-80 evaluations versus 1600-4000 evaluations, for population sizes of MU=100).  In terms of MU = population size, the expected number of evaluations for NSGA-II and SPEA2 is exactly MU*GENS*2, where GENS is the number of generational iterations of the search tool.  For GALE, this expected value is no greater than GENS*2*log2(MU).
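Plugging in MU=100 makes the gap concrete; GENS=20 below is an assumed value, chosen only for illustration:

```python
from math import log2

def evals_standard(mu, gens):
    # NSGA-II / SPEA2: the whole population is evaluated each generation
    return mu * gens * 2

def evals_gale_bound(mu, gens):
    # GALE's upper bound: pivots only, over log2(MU) levels of splitting
    return gens * 2 * log2(mu)

print(evals_standard(100, 20))           # 4000
print(round(evals_gale_bound(100, 20)))  # 266
```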



[1] C. Faloutsos and K. Lin, FastMap: A fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets, ACM SIGMOD Conference (San Jose, CA), May 1995, pp. 163–174.

[2] Harold Hotelling, Analysis of a complex of statistical variables into principal components, J. Educ. Psych. 24 (1933), 417–441, 498–520.

[3] K. Pearson, On lines and planes of closest fit to systems of points in space, Philosophical Magazine 2 (1901), no. 11, 559–572.

[4] W. S. Torgerson, Multidimensional scaling I: Theory and method, Psychometrika 17 (1952), 401–419.

[5] George Ostrouchov and Nagiza F. Samatova. 2005. On FastMap and the Convex Hull of Multivariate Data: Toward Fast and Robust Dimension Reduction. IEEE Trans. Pattern Anal. Mach. Intell. 27, 8 (August 2005), 1340-1343. DOI=10.1109/TPAMI.2005.164 http://dx.doi.org/10.1109/TPAMI.2005.164

Monday, September 9, 2013

#27 - Towards a Ph.D. Thesis Proposal: NextGen Project and Optimization Strategy

With about 7 months remaining until a target graduation date, the first month has to entail projection and proposal.  The next three months, from here until the end of the year, entail the core work and research on that proposal.  The final four months focus on developing the dissertation that culminates that core work, as well as incorporating work done previously.  When I'm at a loss for ideas, I find that writing them out sometimes helps them come forward.  It's like that in creative writing: don't think about what to write, just let the words come out.  Creativity and ideas live in the brain; writing can be an effective way to channel them outward.

"What are my birds?" is asked by my advisor.  I have an algorithm called GALE, which can perform multi-objective optimization in very few evaluations.  This sounds really cool, but this is purely at this point, an algorithmic state of affairs.  Who cares?  What about the application of such an algorithm and its attributes to bettering the world?  This is the algorithms vs applications conflict.  Many theses and topics of research live purely in the world of algorithms, but lately a push for application-world research is called for.

GALE can optimize stuff, and learn stuff, in few evaluations of the model.  If I want a thesis out of this, I need to find a way to tie its importance into the applications world.  Why would doing things faster be a good thing?  The main case I can think of is safety-critical devices, where speed is a requirement.  Secondly, what does it mean to learn stuff in few evaluations?  I really need to jump away from algorithmic-speak here.

What The Birds Are


Overall, I have about four things in general that can be called my birds here.  First and foremost, I have an algorithm called GALE.  This algorithm is a tool for software engineering researchers and intelligent systems design.  The key here is that GALE can be used to aid the development of systems that analyze their environment and make expert decisions very quickly.  The obvious bit is that it might be highly critical for the system to make those decisions quickly - consider, e.g., systems where decisions affect the safety of human lives.  Furthermore, if the system cannot make those decisions quickly enough, say, to react to very unexpected and sudden environmental changes, then lives might also be endangered.

In the algorithmic world in which GALE lives, it needs a simulation to study as its environment.  In the application world, this simulation becomes the environment itself, accessed through machine learning.  This transition is a long way off, but the connection between optimization studies (as with GALE) and machine learning (using optimization to learn) grows stronger every day.  It might be sensible to believe that these two fields will one day merge into one.

For now, we use GALE with a simulation in the algorithm world.  The first study on GALE was performed on a simulation called POM3, in which the process of completing a software project is modeled through requirements engineering (i.e. how best to plan a strategy for completing the project's tasks).  POM3 as a simulation has a handful of decisions that can be made, as well as a set of objectives to optimize in decision making.  Remember that any time a decision is made, the critical-thinking process aims to optimize some goal.  (E.g., what type of car should I buy?  We have goals of minimizing cost, maximizing MPG, maximizing aesthetic appeal, etc.)  For POM3, these goals were Completion (percentage of tasks completed, because a lot of projects only get so far before termination), Cost (total money spent), and Idle Rate (how often developers were waiting around on other teams).  The decisions were things like team size, project size, and other decisions more specific to the software engineering domain.

Going back to developing a thesis proposal; POM3 doesn't really fit our needs.  We'd like a simulation that can be optimized with GALE but has some application-world use where the power of making decisions quickly is very important.  POM3 isn't really safety-critical at all.  While it was a good and meaty simulation with many decisions and objectives, it just won't cut it for proposal in a thesis that wants to live in the application world.

The NASA Birds


My last two birds are two projects from my work out in California with NASA Ames Research Center.  Both projects involve aerospace research; one is a simulation, while the other is a tool for ensuring safety of flight.  So it sounds like, right away, we might have the tools on hand for a thesis if we can combine the two projects.  WMC (Work Models that Compute) is the simulation, while TTSAFE (Terminal Tactical Safety Assurance Flight Evaluation) is the tool for conflict detection of aircraft - making sure they don't collide in airspace.  It sounds trivial, but there are problems that need to be kept contained.

TTSAFE merely examines an input file containing codes that describe aircraft locations, flight plans, tracking data, velocities, altitudes, and more.  This input file feeds into TTSAFE, and a conflict detection algorithm determines whether multiple aircraft are heading toward each other on a collision course.  There are three main parameters for such an algorithm.  The first is how often TTSAFE checks airspace for conflicts (granularity).  Second, to identify conflicts, a line can basically be drawn from every aircraft in airspace, extending along the path of its velocity and direction; if any two lines intersect, then there is a conflict between the two aircraft those lines belong to.  The second parameter is how far to draw the lines, e.g. 3 nautical miles or 10 nautical miles (or perhaps measured in time).  Third, lines may not need to intersect to conflict, but instead merely come close to each other; so the third parameter is the safe radius around each aircraft.
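The three parameters can be made concrete with a hedged sketch of a straight-line conflict check.  The field names (x, y, vx, vy), units, and defaults below are assumptions for illustration, not the real TTSAFE input format or logic:

```python
import math

def positions(ac, horizon_s, step_s):
    """Points along the line extended from the aircraft's current velocity."""
    return [(ac["x"] + ac["vx"] * t, ac["y"] + ac["vy"] * t)
            for t in range(0, horizon_s + 1, step_s)]

def in_conflict(a, b, horizon_s=180, step_s=10, safe_radius_nm=3.0):
    # parameter 1: step_s, how often airspace is re-checked (granularity)
    # parameter 2: horizon_s, how far the lines are extended (here, by time)
    # parameter 3: safe_radius_nm, minimum separation vs strict intersection
    return any(math.hypot(ax - bx, ay - by) < safe_radius_nm
               for (ax, ay), (bx, by) in zip(positions(a, horizon_s, step_s),
                                             positions(b, horizon_s, step_s)))

# two aircraft approaching head-on along the x-axis
head_on = in_conflict({"x": 0, "y": 0, "vx": 0.1, "vy": 0},
                      {"x": 10, "y": 0, "vx": -0.1, "vy": 0})
print(head_on)   # True
```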

Once TTSAFE identifies conflicts, it proposes resolutions to them, and flight plans are adjusted based on rules of right-of-way in airspace.  There are problems with such a conflict detection algorithm, because not every identified conflict is a real one.  For instance, an aircraft may be in the middle of making a turn: while its velocity and current direction extend a line that intersects that of another aircraft, there never would have been a conflict, because the turn in progress "tricked" TTSAFE into believing there would be one.  Nevertheless, such a False Alarm is taken seriously, and the aircraft is signaled to adjust its flight plan - lengthening the miles traveled, costing more, and taking longer to fly.  Overall, these are things we want to optimize, and false alarm rate is a major conflicting objective.

WMC is a project out of Georgia Tech which computes trajectories for aircraft on approach to a runway to land.  The computations rely on physics and aircraft type, and introduce cognitive measures that model the manner in which pilots take action when approaching the runway.  WMC is a simulation: taking one aircraft at a time, along with its flight plan, starting point, starting velocity and starting altitude, it simulates the landing of that aircraft, yielding tracking data throughout to its completion.

After WMC simulates the landing of an aircraft, its tracking data can be added into TTSAFE along with the rest of the airspace.  TTSAFE then determines whether such a landing approach is safe; if not, resolutions are given and WMC re-computes the landing approach, and so on.  Since WMC only deals with landing, our "birds" here only deal with local airspace - say, 50 miles around an airport.  So there is no need to consider cross-country flights from takeoff to landing; instead we would, in a sense, be "spawning" aircraft randomly inside the 50-mile radius around an airport.  To further stress-test the system and emphasize the power of GALE, extreme circumstances can be invented in airspace that might not otherwise occur naturally.  For example: what if an aircraft is hijacked and begins ignoring commands from ground control?  Or, more calmly, what happens if some aircraft simply ignores a command and doesn't adjust its flight plan?

How to Fly The Birds


TTSAFE and WMC on their own are fully developed; the interaction between the two is not.  I'm wondering whether such a task could be completed in a few short months.  TTSAFE is coded in Java, WMC is coded in C++ (c++0x), and GALE is coded in Python.  I can run each of them on its own.  Unless a clever bash script can tie everything together, I could be stuck figuring out how to fly the birds.  The main problem is figuring out how to pass data among all three.  File I/O might be a bad idea, but it is probably the only feasible one.  Going forward, I suppose the best thing is to take it one step at a time.  After all, four years ago I never thought I'd be able to get here, because I looked at it as one giant step.  Instead, the many small steps are what got me here - and they are also the way forward.
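The file-I/O glue could look something like the sketch below: serialize a payload to a file, invoke the external tool on it, and read its output file back.  The command and JSON formats here are stand-ins; the real WMC/TTSAFE invocations and file formats would differ.

```python
import json
import subprocess
import sys
import tempfile

def run_tool(cmd, payload):
    """Pass `payload` to an external program via files and return its output."""
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump(payload, f)
        in_path = f.name
    out_path = in_path + ".out"
    # the external tool reads argv[1] and writes its result to argv[2]
    subprocess.run(cmd + [in_path, out_path], check=True)
    with open(out_path) as f:
        return json.load(f)

# demo "tool": a separate process that echoes the input with a flag added
echo_tool = [sys.executable, "-c",
             "import json,sys; d=json.load(open(sys.argv[1]));"
             " d['processed']=True; json.dump(d, open(sys.argv[2],'w'))"]
print(run_tool(echo_tool, {"aircraft": "B738"}))   # {'aircraft': 'B738', 'processed': True}
```

The same pattern works whether `cmd` launches a Java jar, a compiled C++ binary, or another Python script, which is the whole appeal of file-based glue.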

Step 1) WMC has a problem with simulating only an aircraft of a predefined type.  It needs to be adjusted so that it simulates for an aircraft of an input type.  Thus, it should be able to read its parameter information from the BADA database for that aircraft type.

Step 2) Develop a script which can run WMC a bunch of times for random aircraft types, along with randomized parameters of decision nature (such as those for the cognitive models).

Step 3) Try to tie GALE to the WMC script from step 2.  If we can do this, then any roadblock in further connecting TTSAFE with everything should be understood and made simpler.  Then, although the practicality of optimizing WMC may not yet be understood, at the very least we can see how GALE runs with WMC (and how competing algorithms, i.e. NSGA-II or SPEA2, run on it).    Note that I'm most worried about the "if we can do this" part.

Step 4) Make a script which generates an airspace of aircraft with a variety of flight plans within a small radius around an airport.  This will be an input to the overall system that we ultimately aim to have.

Step 5) Feed the airspace of step 4 into WMC to generate accurate trajectory data for all aircraft.  This means we adjust the script of step 2, so that it instead doesn't generate random aircraft, but instead takes input from the airspace of step 4.  As for cognitive decision parameters, we use what we learned from step 3.

Step 6) We need a script now that feeds data from WMC back into TTSAFE.  Basically, we just need a way to adjust the tracking data in the airspace input file (made initially in step 4).

Step 7) Lastly, a script that combines all of these into a loop that runs until all aircraft land.  Then we compute statistics and metrics for the process - stuff we want to optimize.

Step 8) Steps 4-7 become our ultimate model.  Whatever we call it, we then feed it through GALE vs NSGA-II vs SPEA2 to optimize things and learn things, blah, blah.

Step 9) Adjustments, twitches, fixes.  Looking for and getting sane data from step 8.

Step 10) If we have sane data, then can we publish it?  i.e. can we go forward and put it all into a thesis proposal?

Tuesday, July 9, 2013

Krall Numbers

I'd always been interested in this kind of number I had discovered.  Take any ordinary positive integer - the most typical example here is 7 - and look at all of its fractions under 1.  That is, look at the fractions 1/7, 2/7, ..., all the way up to 6/7.  Expand those fractions, and then check out their decimal expansions.

Here, we see each respective decimal expansion in sequence.  The most interesting thing here is that there are patterns to be found in each expansion.  The pattern for the number 7 seems to be "142857".  This pattern cyclically repeats itself all throughout each expansion for the number 7.  The needed shift to match the cyclic pattern is also shown below, and the pattern being matched is indexed at the end of each line.

0.142857142857    , shift=  0,pattern#=  1,*
0.285714285714    , shift= -2,pattern#=  1,*
0.428571428571    , shift= -1,pattern#=  1,*
0.571428571429    , shift=  2,pattern#=  1,*
0.714285714286    , shift=  1,pattern#=  1,*
0.857142857143    , shift=  3,pattern#=  1,*

Some numbers have more than one pattern.  For example, the number 8 has 7 unique patterns.

0.125               , shift=  0,pattern#=  1,*
0.25                , shift=  0,pattern#=  2,**
0.375               , shift=  0,pattern#=  3,***
0.5                 , shift=  0,pattern#=  4,****
0.625               , shift=  0,pattern#=  5,*****
0.75                , shift=  0,pattern#=  6,******
0.875               , shift=  0,pattern#=  7,*******

However some numbers are a little more interesting.  There are identifiable patterns in the pattern index itself!  For example, check out number 11, where the stars at the end form a symmetric mountain of sorts.

0.09090909090909090909    , shift=  0,pattern#=  1,*
0.18181818181818181818    , shift=  0,pattern#=  2,**
0.27272727272727272727    , shift=  0,pattern#=  3,***
0.36363636363636363636    , shift=  0,pattern#=  4,****
0.45454545454545454545    , shift=  0,pattern#=  5,*****
0.54545454545454545455    , shift=  1,pattern#=  5,*****
0.63636363636363636364    , shift=  1,pattern#=  4,****
0.72727272727272727273    , shift=  1,pattern#=  3,***
0.81818181818181818182    , shift=  1,pattern#=  2,**
0.90909090909090909091    , shift=  1,pattern#=  1,*

And for number 13, we see again another symmetric pattern in the stars:

0.076923076923076923076923    , shift=  0,pattern#=  1,*
0.153846153846153846153846    , shift=  0,pattern#=  2,**
0.230769230769230769230769    , shift=  2,pattern#=  1,*
0.307692307692307692307692    , shift=  1,pattern#=  1,*
0.384615384615384615384615    , shift=  4,pattern#=  2,**
0.461538461538461538461538    , shift=  2,pattern#=  2,**
0.538461538461538461538462    , shift=  5,pattern#=  2,**
0.615384615384615384615385    , shift=  1,pattern#=  2,**
0.692307692307692307692308    , shift=  4,pattern#=  1,*
0.769230769230769230769231    , shift=  5,pattern#=  1,*
0.846153846153846153846154    , shift=  3,pattern#=  2,**
0.923076923076923076923077    , shift=  3,pattern#=  1,*

And perhaps others have no shared pattern at all, such as the number 37.  Still others have some truly odd quirks, as in 26 when you look at the expansions of 22/26 and 23/26.  In fact, you see this a lot.

There's much we can learn and marvel at just by examining expansion sequences this way.  You can try it yourself with a Python script I developed, located at: http://pastebin.com/a5UY0F0i.  To run the script, use "python krall_numbers.py 7" to run the test on the number 7.  Simply replace 7 with any number you want to experiment with.
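The core idea - finding the repeating block of a fraction - can be re-derived in a few lines of long division.  This is a minimal sketch of that one piece; the full script linked above also computes the shifts and pattern indices shown in the tables:

```python
def repetend(num, den):
    """Digits of the repeating block of num/den, found via long division."""
    seen, digits = {}, []
    r = num % den
    while r and r not in seen:
        seen[r] = len(digits)      # remember where this remainder appeared
        r *= 10
        digits.append(r // den)    # next decimal digit
        r %= den
    # a repeated remainder marks the start of the cycle; r == 0 terminates
    return "".join(map(str, digits[seen[r]:])) if r else ""

print(repetend(1, 7))   # 142857
print(repetend(2, 7))   # 285714 -- a cyclic shift of the same pattern
print(repetend(1, 8))   # "" (terminating expansion: 0.125)
```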


Friday, April 19, 2013

#26 - The Art of Game Design - Chapter 30, 31, 32

These three chapters all focus on things a bit beyond the game.

Chapter 30 - The Game Transforms the Player


Jesse Schell brings up the topic of violence in video games.  I can't say I believe that a game transforms the player, but I do believe games change the way players think.  That is, games can be inspirational - but so can movies or books.  A lot of my core theories on fun center around my big three: books, movies, games.  For me, they're all the same under some general theory of fun.  The second major topic in this chapter concerns the habit of addiction.  Admittedly, games are addictive - but most entertainment simply is.

Chapter 31 - Designers have Certain Responsibilities


As a game designer, your interest may begin as a hobby.  When you design games for the industry, however, you are now representing the industry.  As a consequence, the industry also defines you.  Carrying that definition with you assigns you responsibilities.  When you realize that your game can transform people, you realize that it is your duty to transform them positively.

Chapter 32 - Each Designer has a Motivation


Any game designer needs to understand their motivation.  It is the reason they get the job done.  But the question is: what exactly is your motivation?  If it isn't worth your time, your motivation isn't strong enough.

Thursday, April 4, 2013

#25 - The Art of Game Design Chapters 21,22,25

Chapter 21 - Some Games are Played with Other Players


This chapter introduces the concept of playing with other players, stating that humans try to avoid being alone (most humans, at least).  Multiplayer components are important to some games because they provide humans a way to play the game while avoiding being alone.  This chapter is largely a precursor to the next, in which the communities formed in multiplayer games become the center of discussion.

Chapter 22 - Other Players Sometimes Form Communities


The discussion of a community brings many topics to the table.  Players must have a way to take part in the community as though it were a real-life community.  For this, the Lens of Expression is introduced.  It is mentioned that at the heart of any community is conflict, but this is not always true - some games are cooperation-based, and some are based purely on meeting others, i.e. a tea party.

The Lens of Griefing is introduced to discuss the topic of players misbehaving in the game society.  Any community needs to be managed and policed, typically through game masters and moderators.


Lens #85: The Lens of Expression
Lens #86: The Lens of Community
Lens #87: The Lens of Griefing



Chapter 25 - Good Games Are Created Through Playtesting


Playtesting is discussed in this chapter as a component of game design that is necessary to ensuring and enhancing the fun of the game.  Although the author admits to hating playtesting, it is a crucial stage of design, which he splits into four groups: focus groups, usability testing, playtesting (general testing), and QA testing.  Focus groups or surveys can be powerful tools when done right.  The questions of interest to focus on are who (does the survey/testing), when (are you testing), what (are you testing), why, and where.


Lens #91: The Lens of Playtesting

Saturday, March 23, 2013

#24 - The Art of Game Design Chapter 18,19,20

Chapter 18 - Worlds contain Characters


The breakdown of different character types adds some depth to my own understanding of a game, based on the three main types of media - books, movies and games.  But it's important that a video game character can be of any type - they should all work.  For instance, the dimensions of character complexity can be a parameter to a game.  Another point to emphasize is that characters should be memorable, and one thing that adds to this is the character's name itself.  Names like Samus Aran come to mind - the type of name you don't see in the real world.  Because of their uniqueness, such names tend to stick out and bring about an easier attachment to the overall experience of being, or meeting, the character.

An excellent example related to character traits is Chrono Cross.  The lead programmer for that game wrote a script that "translates" the style of speech of the characters depending on which character accompanies you through the main events of the game.  The story told by the character is the same regardless of which character you take.  But, for example, you may take a pirate-type character or a French dame with you, and the language is converted to reflect that type of character.

Lens #75: The Lens of the Avatar
Lens #76: The Lens of Character Function
Lens #77: The Lens of Character Traits
Lens #78: The Lens of the Interpersonal Circumplex
Lens #79: The Lens of the Character Web
Lens #80: The Lens of Status
Lens #81: The Lens of Character Transformation

Chapter 19 - Worlds Contain Spaces


The space of the world is the physical designed world in which the character can roam.  I would add to this the manner in which characters move through the world.  For instance, a lot of games emphasize continuous smooth movement, while some focus on grid-based cell movement (like chess).

Lens #82: The Lens of Inner Contradiction
Lens #83: The Lens of The Nameless Quality

Chapter 20 - The Look and Feel of a World Is Defined by Its Aesthetics


This chapter feels like a recap of a lot of the previous Lenses.  In summary, aesthetics can help bring your game to life and add to the memorable experience.  It takes a real graphical eye to see whether the aesthetics are proper - and it depends on the kind of game you build.  For example, a non-photorealistic approach may be more appropriate than a cel-shaded cartoon style.  As another point: sometimes people don't notice details, so the level of detail used may depend on the environment.  That seems funny to say.

Sunday, March 17, 2013

#23 - JMOO Source Code Release

Search-based software engineering brings us a need to simulate real-world problems and experiment with optimizing objectives.  For this, we need a way to model both the simulations and the algorithms used to optimize objectives.  Formally, this is the field of Multi-Objective Optimization (MOO).  When there is only a single objective to optimize, it is called Single-Objective Optimization (SOO) instead.  In slightly older times, SOO was the more-studied, simplified case, before its maturation and generalization into MOO.

MOO is best described in mathematical terms.  MOO is used to optimize a Multi-Objective Problem (MOP).  A MOP consists of a set (X1, X2, …, Xn) of n decision (explanatory) variables, a set (Y1, Y2, …, Yk) of k objective (response) variables, and either a set of objective evaluation functions (f1(X1, X2, …, Xn), f2(X1, X2, …, Xn), …, fk(X1, X2, …, Xn)) or an evaluation model, either of which assigns values to each of the k objectives.   The decision variables each have their respective lower and upper bounds, and in the case of constrained MOPs, the set of decisions as a whole is constrained to some further set of bounds.  Each objective includes an indication of which direction (minimizing or maximizing) is optimal.  Typically, the optimization direction is ignored and an assumption is made, but I feel a need for its generalization.
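That definition maps directly onto code.  Below is a minimal sketch of the MOP structure in Python - illustrative names only, not the actual jmoo API:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Decision:
    name: str
    low: float      # lower bound
    high: float     # upper bound

@dataclass
class Objective:
    name: str
    minimize: bool = True   # keep the optimization direction explicit

@dataclass
class MOP:
    decisions: List[Decision]
    objectives: List[Objective]
    evaluate: Callable[[List[float]], List[float]]  # or an evaluation model

# toy MOP: minimize the sum of squares, maximize the plain sum
mop = MOP(
    decisions=[Decision("x1", 0.0, 1.0), Decision("x2", 0.0, 1.0)],
    objectives=[Objective("f1", minimize=True),
                Objective("f2", minimize=False)],
    evaluate=lambda xs: [sum(x * x for x in xs), sum(xs)],
)
print(mop.evaluate([0.5, 0.5]))   # [0.5, 1.0]
```

Keeping `minimize` as an explicit field is exactly the generalization of optimization direction argued for above.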



As a broad overview, MOO comprises both: 1) an MOP and 2) an algorithm to solve the MOP.  We just described the MOP above; now we discuss algorithmic approaches.  The most used is the MOEA (Multi-Objective Evolutionary Algorithm).  Two state-of-the-art MOEAs are NSGA-II and SPEA2.  The MOEA is a standard type of algorithm that we build upon when presenting JMOEA (Joe's MOEA).

1. Initialization
2. Load Initial Population
3. Collect Initial Stats
4. Generational Evolution:
   4a. Selector
   4b. Adjustor
   4c. Recombiner
   4d. Collect New Stats
   4e. Evaluate Stopping Criteria
5. Generational Representative Search

The outline above constitutes our JMOEA.  First, parameters are initialized, including MU, which defines the number of individuals in a population.  Each individual is a listing of decisions and (when evaluated) objective fitness.  In the second step, an initial population is loaded - not generated.  This is so that when two algorithms try to optimize the same problem, they both begin with the same initial set of individuals.  Thirdly, stats are collected for the initial population.  The initial stats collection involves setting the reference point at the median of the population.  This median is then used to calculate our measures of quality and spread of the individuals in the population.  These measures help us determine how well the algorithm is doing.  Other stats include the medians of each objective and the number of evaluations accumulated thus far.

The core of our JMOEA is step four, where evolution starts.  Evolution is carried out for a number of generations, or until the stopping criteria of step 4e say to stop.  The methods of 4a, 4b and 4c are the steps that define the algorithm.  The selector first identifies a number of individuals from the population to use as "selectees", or offspring.  Then the adjustor in step 4b modifies these selectees (usually per some chance rate).  Finally, in 4c, the selectees and population are recombined, either to prune the size back down to MU-many individuals, or to grow back up to MU-many individuals.
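Steps 4a-4e can be sketched as a generic loop into which the three defining methods are plugged.  This is a hand-rolled skeleton, not the jmoo source; the selector, adjustor and recombiner in the demo are toy stand-ins:

```python
def jmoea(population, selector, adjustor, recombiner, gens=20,
          stopping=lambda pop: False):
    history = [list(population)]                        # snapshot per generation
    for _ in range(gens):
        selectees = selector(population)                # 4a: pick offspring
        selectees = [adjustor(s) for s in selectees]    # 4b: perturb them
        population = recombiner(population, selectees)  # 4c: back to MU-many
        history.append(list(population))                # 4d: collect new stats
        if stopping(population):                        # 4e: stopping criteria
            break
    return history

# toy run on plain numbers: halve the two best, keep the best MU=3 overall
runs = jmoea([4.0, 8.0, 16.0],
             selector=lambda pop: sorted(pop)[:2],
             adjustor=lambda x: x / 2,
             recombiner=lambda pop, sel: sorted(pop + sel)[:len(pop)],
             gens=3)
print(runs[-1])   # [0.5, 1.0, 1.0]
```

Swapping in tournament selection, crossover-plus-mutation, and an elitist recombiner would turn this same skeleton into the NSGA-II shape described next.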

An algorithm is then defined by assigning specific methods to each of 4a, 4b and 4c.  For instance, NSGA-II is defined when the selector is tournament selection, the adjustor is crossover and mutation, and the recombiner is the NSGA-II elitist method (as described in its original paper).  SPEA2 is defined similarly, except its recombiner is the SPEA2 elitist method.

Finally, in step 5, we search through all the generations of the evolutionary process for a representative generation - ideally one of the better generations, which we can use to report how well the algorithm worked.

The code we use to implement JMOEA is contained in our JMOO package (Joe's MOO).  JMOO allows us to define decisions, objectives, problems, algorithms, individuals and the JMOEA.  It is very versatile, in that new problems can be defined very easily, as well as new algorithms.  For fun: JMOO is our purple cow.


The code repository (python) for jmoo: http://unbox.org/things/tags/jmoo/
The rrsl method in jmoo_algorithms.py will require parts of http://unbox.org/things/tags/rrsl/
And the POM3 problem is at: http://unbox.org/things/tags/pom3/


Thursday, March 14, 2013

#22 - POM3 Code Release

POM3 is the name given to my adaptation of POM2 (2009).  Similarly, POM2 is an adaptation of the 2008 original, POM.  Named for its originating authors - Port, Olkov and Menzies - POM was a simulation of the requirements engineering process in software engineering.

The job of the Requirements Engineer is to decide what gets implemented in a software project.  Typically this is done via priorities, but prioritization strategies differ.  Traditionally, the Requirements Engineer deployed a Plan-Based strategy, in which all requirements were prioritized once at the beginning of the project and never altered throughout the course of development.  In the alternative Agile development strategy, however, requirements are re-prioritized during every iteration of development.  The question POM tried to answer is: "which strategy is better?"

To answer the question, POM had to simulate many software projects very quickly, and then evaluate those simulations.  Formally, this fits into the realm of Search-Based Software Engineering: POM was a Single-Objective Problem.  For the originating authors, the search for the better strategy was a manual one, involving plots of parameters over 1,000 trials; in other words, a very inefficient approach of eyeballing the results.  The authors of POM2, however, deploy a search technique known as KEYS.  In either case, the results are similar: the Plan-Based strategy is good in some cases, when requirements are stable, but Agile is better when requirements are volatile and prone to change throughout development.

For more information, refer to the references section.  [1] is the POM paper, and [2] is the POM2 paper.

POM3, however, is no longer a Single-Objective Problem, and is no longer a simulation focused only on Requirements Engineering prioritization strategies.  Rather, it is a general software project simulation used to control our additional objectives of Cost, Idle and Completion (along with Score).  Below, we discuss both the decisions and objectives of POM3, then give an outline of the code, followed by some samples of the code.

POM3 Decisions

Decisions are those inputs to the POM3 simulation.  They are as follows.

  1. Culture.  The percentage of personnel on the project that thrive on chaos.  Some projects are chaotic.  That is, a lot of requirements change and are unstable, i.e. volatile.
  2. Criticality.  The impact to cost that safety-critical systems have upon projects.  Some projects have safety-critical requirements: requirements for which failure absolutely cannot happen, for the safety of human lives.
  3. Criticality Modifier.  The percentage of teams that are dealing with safety-critical parts of the project.
  4. Dynamism.  How dynamic is the project?  This affects the rate at which new requirements are found.
  5. Interdependency.  A percentage of how many requirements depend on other requirements (assigned to other teams in the project).
  6. Initial Known.  The percentage of the project that is initially known.  In the real world we can only guess at this; in the simulation, all requirements exist, but some are marked as hidden.
  7. Team Size.  The size of teams, in terms of how many personnel are on each team.
  8. Size.  The project size.  Very small, small, medium, large, very large.
  9. Plan.  The requirements prioritization strategy.  We provide 5 different schemes.

POM3 Objectives

Objectives are the outputs of the POM3 simulation that score how well the project was finished.

  1. Score.  A score indicating the performance of the project's simulation.
  2. Cost.  The average cost per completed requirement across all teams.
  3. Idle.  The overall percentage of idleness of the project's teams.
  4. Completion.  The overall percentage of how many visible requirements were completed.

POM3 Algorithm

The outline of the POM3 simulation is as follows.  We avoid core details while providing the code.

  1. Initialization.
  2. Requirements Generation.
  3. Teams Generation.
  4. Shuffling.
    • For Each Team, Do: 
      1. Assess Budget Information
      2. Collect Available Tasks
      3. Apply Sorting Strategy
      4. Execute Available Tasks
      5. Discover New Tasks
      6. Update Task Priorities
  5. Scoring.

This outline is contained in the pom3.py file.  The initialization step processes the inputs by wrapping them in a pom3_decisions class structure, and the number of shuffling iterations is determined.  Then we move on to step 2, where requirements are generated into a heap.  The code for this is handled in pom3_requirements.py, which further makes use of the pom3_requirements_tree.py file.  In step 3, we generate teams and assign parts of the requirements heap to each team.  The code that manages the assembly of teams is in pom3_teams.py, and the code that manages each team individually is in pom3_team.py.

Finally, shuffling occurs.  Only a few iterations of shuffling occur, as determined back in the initialization step.  In each iteration, requirements are shuffled through each team over a series of steps.  First, the team assesses its budget information and collects the money allocated to it based on its workload.  Second, the team collects any tasks it can possibly work on.  Third, it sorts these available tasks according to some sorting strategy (a requirements prioritization strategy).  Remember when I said Plan-Based strategies don't do this every iteration?  Well, in POM3 something happens during every iteration, so we're ignoring the traditional strategy for the sake of simplicity.  It would be relatively simple to reintroduce it properly; we ignore it simply because we don't care much about it here.

The fourth step of shuffling executes as many of the available tasks as possible, within the budget limitations assessed in step one of shuffling.  Fifth, we discover new tasks, which simply means marking some more requirements of the heap as visible that were previously invisible.  Finally, all tasks have their 'values', i.e. priorities, updated.  By the way, the code for each shuffling step is contained in the pom3_team.py file, as all the work here is done per team.
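The six shuffling steps can be sketched for a single team as follows.  This is a simplified stand-in for what pom3_team.py does: the Task record, the budget and discovery numbers, and the value-decay rule are all assumptions made for illustration.

```python
import random

class Task:
    def __init__(self, cost, value, visible=True):
        self.cost, self.value, self.visible = cost, value, visible
        self.completed = False

def shuffle_once(tasks, budget):
    # 1. Assess budget information (here: a flat allocation).
    remaining = budget
    # 2. Collect available tasks: visible and not yet completed.
    available = [t for t in tasks if t.visible and not t.completed]
    # 3. Apply a sorting strategy (here: highest value/cost first).
    available.sort(key=lambda t: t.value / t.cost, reverse=True)
    # 4. Execute available tasks within the budget limit.
    for t in available:
        if t.cost <= remaining:
            t.completed = True
            remaining -= t.cost
    # 5. Discover new tasks: mark some hidden requirements visible.
    for t in tasks:
        if not t.visible and random.random() < 0.5:
            t.visible = True
    # 6. Update task priorities (here: a simple decay of value).
    for t in tasks:
        if not t.completed:
            t.value *= 0.9
    return remaining

random.seed(0)
tasks = [Task(3, 9), Task(5, 5), Task(2, 8, visible=False)]
left = shuffle_once(tasks, budget=6)
```

With a budget of 6, only the highest-ratio task (cost 3, value 9) gets executed, the still-hidden task may or may not be discovered, and every unfinished task has its priority decayed.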

Lastly, now that the simulation is finished, we score our objectives.  Score is the funkiest objective: it considers each completed task's final priority value, as updated in every iteration, as the optimal_value.  The true value is recorded as tasks are completed in real time, before they are further updated in the last shuffling phase.  As a ratio, the score is calculated as frontier/optimal_frontier, where frontier = value/cost and optimal_frontier = optimal_value/optimal_cost.  In the code, we refer to frontier as "our frontier" and the optimal frontier as the "god frontier", since it is information only a god-like entity could possibly know (we don't actually update the completed tasks in real life, but they get considered in the simulation for scoring purposes).
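As a sketch, the scoring ratio works out like this (the numbers here are made up; the real bookkeeping happens inside the POM3 scoring step):

```python
def score(value, cost, optimal_value, optimal_cost):
    frontier = value / cost                          # "our frontier"
    optimal_frontier = optimal_value / optimal_cost  # the "god frontier"
    return frontier / optimal_frontier

# e.g. we realized 40 units of value for 10 cost, where a god-like
# entity would have realized 50 units for the same 10 cost:
s = score(40, 10, 50, 10)
```

A score of 1.0 means our frontier matched the god frontier exactly; lower scores mean the team realized less value per unit cost than was possible.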

Extra

By the way, here's some fun python stuff.

class pom3_decisions:
    def __init__(p3d, X):
        p3d.culture = X[0]
        p3d.criticality = X[1]
        p3d.criticality_modifier = X[2]
        p3d.initial_known = X[3]
        p3d.interdependency = X[4]
        p3d.dynamism = X[5]
        p3d.size = int(X[6])
        p3d.plan = int(X[7])
        p3d.team_size = X[8]

The code above is a sample from POM3.  A simple class structure makes any instance of the pom3_decisions class an effective record of sorts.  When one instantiates the pom3_decisions class, one makes use of the class constructor (def __init__).  Now, the cool part is that, unlike languages such as Java, I can give my own name to the "this" reference.  In the sample above, I use the name p3d and assign class data using it.  This is no different from using Python's default "this" name, which is "self".  But sometimes, just sometimes - you can make the code look like syntactic candy with just the right name.

class pom3_requirements:
    def __init__(requirements, decisions):
        requirements.heap = requirements_tree()
        requirements.count = int(2.5*[3,10,30,100,300][decisions.size])
        requirements.decisions = decisions
        
        for i in range(requirements.count):
            
            ...
            
            parent = requirements.heap.tree[i]
            requirements.recursive_adder(parent, 1)

A heap is a bunch of trees crammed together; requirements.heap.tree[i] accesses the i'th tree of the heap.  Looks just like English.  But it's not, it's Python!

References

[1] D. Port, A. Olkov, and T. Menzies, “Using simulation to investigate requirements prioritization strategies,” in Automated Software Engineering, 2008. ASE 2008. 23rd IEEE/ACM International Conference on, Sept. 2008, pp. 268–277.
[2] B. Lemon et al., "Applications of Simulation and AI Search: Assessing the Relative Merits of Agile vs Traditional Software Development," in Automated Software Engineering, 2009. ASE 2009. 24th IEEE/ACM International Conference on, Nov. 2009, pp. 580-584.

Friday, March 8, 2013

#21 - The Art of Game Design

Chapter 13 - Players Play Games Through an Interface


The interface provides the player a window into the virtual world.  It must be fluid and allow for playability; when the interface is no longer fluid, it becomes an obstacle to the player.  This chapter elaborates on that window.  And while the interface is important, feedback is equally important: actions sent into the game via the interface must be reported back to the player immediately.  Without feedback, the player is unable to learn the dynamics of the game, and will never understand how to play it.  The same can be said of the real world - feedback helps us learn.


Lens #53: The Lens of Control.
Lens #54: The Lens of Physical Interface
Lens #55: The Lens of Virtual Interface
Lens #56: The Lens of Transparency
Lens #57: The Lens of Feedback
Lens #58: The Lens of Juiciness
Lens #59: The Lens of Channels and Dimensions
Lens #60: The Lens of Modes


Chapter 14 - Experiences can be Judged by their Interest Curves

The game should build through more and more interesting events, rising and falling, before it peaks at the end for a grand finale.  This idea is described as the interest curve, and a number of factors are described that add to or detract from interest in the game.  A player's interest needs to be captured very early, and it needs to be kept.  But after a player's interest has been piqued, it can only be held for so long.  When an event occurs that is a little *more* interesting, it renews the player's interest; done smartly, the player can enjoy the game throughout, until its end.

Lens #61: The Lens of the Interest Curve
Lens #62: The Lens of Inherent Interest
Lens #63: The Lens of Beauty
Lens #64: The Lens of Projection

Sunday, March 3, 2013

#20 - JMOO

MOO (Multi-Objective Optimization) is a field, used heavily in Software Engineering, that involves solving problems whose decisions affect the objective goals.  Formally, the mathematical expression of the problem is as follows.

A Multi-Objective Problem (MOP) consists of a set of n decisions and k objectives.  The decisions all have upper and lower bounds, as well as constraints that they must fall within.  The objectives are similar to decisions, but we do not know their bounds.  Instead, we just know our goals: to maximize or minimize each, as specified by each objective's optimization direction.  The objective functions provide the evaluation model mapping decisions to objective scores.  Sometimes, however, the problem doesn't have a simple set of functions for this; it may instead be a process or a set of steps that cannot be easily modeled.  Either way, the objective functions provide a way to map decisions to objectives.

JMOO is our modeling language (Joe's Multi-Objective Optimization), implemented in Python.  The framework is extremely simple and provides an easy way of devising your own MOPs.  Below we show code for implementing the Srinivas MOP.


class srinivas(jmoo_problem):
  def __init__(prob):
    prob.name = "Srinivas"
    prob.decisions = [jmoo_decision("x" + str(i+1), -20, 20) for i in range(2)]
    prob.objectives = [jmoo_objective("f1", True), jmoo_objective("f2", True)]
  def evaluate(prob,input = None):
    if input:
        for i,decision in enumerate(prob.decisions):
            decision.value = input[i]
    x1,x2 = prob.decisions[0].value, prob.decisions[1].value
    prob.objectives[0].value = (x1-2)**2 + (x2 - 1)**2 + 2  
    prob.objectives[1].value = 9*x1 - (x2 - 1)**2
    return [objective.value for objective in prob.objectives]
  def evalConstraints(prob):
    x1,x2 = prob.decisions[0].value, prob.decisions[1].value
    if (x1**2 +   x2**2 > 225): return True
    if (x1    - 3*x2    > -10): return True
    return False


Each MOP is implemented as a subclass of the jmoo_problem base class.  The __init__ method provides information about the MOP's objectives (names, optimization direction) and decisions (name, lower bound, upper bound), using list comprehensions to put them in lists.  Decisions and objectives use the jmoo_decision and jmoo_objective classes, which both act as relatively simple record data types.  The base class implements a hidden generateInput method, which generates a set of inputs compliant with each decision's bounds and the constraints (using the evalConstraints method).  If the problem has no constraints, then evalConstraints simply returns False.  The evaluate method, on the other hand, evaluates the decisions and sets the objective values.  If you call evaluate with no parameters, it evaluates the decisions' current values, but you must be sure to call generateInput beforehand.  You can, however, supply your own decision values as input if generateInput doesn't suit your needs (since it generates randomly).

To simplify the process of adding your own MOP, use the following as scaffolding.  To see how simple it is, observe that the lines marked with TODO comments (plus the names, bounds and optimization directions) are the only lines you need to adjust.


class my_new_mop(jmoo_problem):
  def __init__(prob):
    prob.name = "What to call my new MOP?"
    LOWS = [1, 2, 3, 4, 5]
    UPS = [10, 20, 30, 40, 50]
    OD = optimalDirections = [True, True, False]
    n = numDecisions = 5
    k = numObjectives = 3
    prob.decisions=[jmoo_decision("x"+str(i+1),LOWS[i],UPS[i]) for i in range(n)]
    prob.objectives=[jmoo_objective("f"+str(i+1), OD[i]) for i in range(k)]
  def evaluate(prob,input = None):
    if input:
        for i,decision in enumerate(prob.decisions):
            decision.value = input[i]
    X = [prob.decisions[i].value for i in range(len(prob.decisions))]
    prob.objectives[0].value = 1 # TODO
    prob.objectives[1].value = 2 # TODO
    prob.objectives[2].value = 3 # TODO
    return [objective.value for objective in prob.objectives]
  def evalConstraints(prob):
    X = [prob.decisions[i].value for i in range(len(prob.decisions))]
    #TODO: if (expression involving the X falls out of bounds): return True
    #TODO: if (expression involving the X falls out of bounds): return True
    return False
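
To see the scaffolding in action without the real jmoo package, here is a self-contained sketch with minimal stand-ins for jmoo_decision, jmoo_objective and a rejection-sampling generateInput.  These stubs are my own assumptions about the interface, not the actual jmoo implementation:

```python
import random

class jmoo_decision:
    def __init__(self, name, low, up):
        self.name, self.low, self.up = name, low, up
        self.value = None

class jmoo_objective:
    def __init__(self, name, minimize=True):
        self.name, self.minimize = name, minimize
        self.value = None

class jmoo_problem:
    def generateInput(prob):
        # Rejection sampling: draw within bounds until the constraints
        # pass (evalConstraints returns True on a violation).
        while True:
            for d in prob.decisions:
                d.value = random.uniform(d.low, d.up)
            if not prob.evalConstraints():
                return [d.value for d in prob.decisions]

class srinivas(jmoo_problem):
    def __init__(prob):
        prob.name = "Srinivas"
        prob.decisions = [jmoo_decision("x" + str(i+1), -20, 20) for i in range(2)]
        prob.objectives = [jmoo_objective("f1", True), jmoo_objective("f2", True)]
    def evaluate(prob, input=None):
        if input:
            for i, decision in enumerate(prob.decisions):
                decision.value = input[i]
        x1, x2 = prob.decisions[0].value, prob.decisions[1].value
        prob.objectives[0].value = (x1-2)**2 + (x2-1)**2 + 2
        prob.objectives[1].value = 9*x1 - (x2-1)**2
        return [objective.value for objective in prob.objectives]
    def evalConstraints(prob):
        x1, x2 = prob.decisions[0].value, prob.decisions[1].value
        if x1**2 + 3*0*x2 + x2**2 > 225: return True
        if x1 - 3*x2 > -10: return True
        return False

random.seed(42)
prob = srinivas()
inputs = prob.generateInput()   # constraint-compliant decision values
scores = prob.evaluate()        # objective scores for those decisions
```

The generateInput/evaluate round trip is the whole client-side workflow: draw a legal input vector, then map it to objective scores.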


Friday, March 1, 2013

#19 - The Art of Game Design - Chapters 7,8,9

Chapter 7 - The Game Improves through Iteration


I'd read a paper once that described how a team made a game in 4 weeks, but it sucked.  After careful reconsideration and 12 more weeks of iteration, they polished the game into a fine product which became one of the best games on the mobile market.  It doesn't take a long time to make a game, but to make a finely polished game is something entirely different.  Unfortunately, thanks to the great video game crash of 1983, we realize that players don't just want games, they want great games.

Lens #13: The Lens of the Eight Filters
Lens #14: The Lens of Risk Mitigation
Lens #15: The Lens of the Toy

Chapter 8 - The Game is Made for the Player

Know your audience and know how to entertain them! An entire chapter is here that describes how to do it and introduces several taxonomies that I have never seen before.

Lens #16: The Lens of the Player
Lens #17: The Lens of Pleasure



Chapter 9 - The Experience is in the Player's Mind


In this chapter we talk a lot about experience from the perspective of the player. One topic that comes up a lot in my research is Flow. A game needs to be challenging enough without being too difficult. This makes up an "Aspect of Replayability" in my characterization of games in the aspect of Challenge. The other five aspects are Social, Experience, Mastery, Impact and Completion (SChEMICo).

Equally important however is motivating the player to get into your game at all. This is done by Playability. We need to both judge what entices the player and what keeps them happy in the game.

Lens #18: The Lens of Flow
Lens #19: The Lens of Needs
Lens #20: The Lens of Judgment

Friday, February 22, 2013

#18 - The Art of Game Design - Chapters 4,5,6

Chapter 4 - The Game Consists of Elements


Much like a doctor needs to really know the human anatomy, a game designer needs to know what makes a game tick.  Schell labels the four elements of a game as being: Mechanics, Story, Technology and Aesthetics.  I couldn't agree more.

Lens #7: The Lens of the Elemental Tetrad.  Is your game comprised of these elements?
Lens #8: The Lens of Holographic Design.  Do you understand what contributes to your play experience?


Chapter 5 - The Elements Support a Theme


A theme in the game helps express the game as meaningful.  Largely, it was very hard to express meaning in a game until technology rose to the occasion.  And where there is meaning, there is a theme.  So to add meaning, support your theme!

Lens #9: The Lens of Unification.  Does your game have a theme?  What are you doing to strengthen it?
Lens #10: The Lens of Resonance.  What are you doing to make your player very excited to be in the game?


Chapter 6 - The Game Begins with an Idea


The story of Schell being a young juggler is cute.  Inspiration comes from everywhere - as long as it's not from games that look similar to your game or from other designers.  That is, true inspiration cannot be taken; it must be listened to.

Lens #11: The Lens of Infinite Inspiration.  Is there a real life experience that lends its inspiration to the game?

Game design is still design, and design involves solving a problem.  The first step to game design is to identify a problem.

Lens #12: The Lens of the Problem Statement.  What is the problem?  In design, we need to solve it.

Another source of inspiration: your subconscious.  In this chapter we have a number of interesting segments to read discussing how dreams have helped solve problems.  So, go dream!

#17 - IBD - Indicator Based Distance

IBD is what I'd like to call an indicator-based distance measure between two points.  It works exactly the same way a binary quality indicator works.  To recap: we're talking about optimizing multiple objectives (MOO), typically when there are trade-offs among the objectives, in a multi-objective problem (MOP) with any number of decision (input) variables.  To perform the optimization, a multi-objective evolutionary algorithm (MOEA) is typically applied until there is an acceptable convergence of optimization across the objectives.

A recently researched class of MOEA is the Indicator-Based Evolutionary Algorithm (IBEA), in which a binary quality indicator, rather than standard domination measures, is used to improve each generation.  Standard domination asks whether one individual's objective scores are dominated by another's, which requires being no worse on every objective and strictly better on at least one.  But this is not an informative question: whether we dominate or not, we never learn by how much.  And so an indicator-based approach was suggested instead.

Zitzler proposes an indicator to use for IBEA in [1], as follows.  The "loss in quality" is measured between an individual and its residing population using the formulation shown just below:

Figure 1.  The IBEA quality indicator for a point P1 in population P: F(P1) = sum over P2 in P \ {P1} of -e^(-I(P2, P1)/kappa), where I is a binary quality indicator (e.g. the additive epsilon indicator) and kappa is a scaling factor.
To replace standard domination between two points, P1 and P2, we first compute F(P1) and F(P2) using the IBEA quality indicator above, and then compare them.  F(P1) measures the loss in quality of removing P1 from the population, and likewise F(P2) for P2.  If F(P1) is higher than F(P2), then removing P1 causes a higher loss in quality, and P1 is thus more important to keep in the population.

Going further, we need a summarizing method to identify how well we have optimized the search space.  For this, the hypervolume indicator is typically used with respect to a reference point.  However, hypervolume is a very expensive metric to compute, as illustrated just below.  I want to propose an Indicator-Based Distance as follows.

Figure 2.  The red box outlines the hypervolume to be calculated for the set of blue points and a purple reference point.

Using the reference point R as part of the population, compute the IBEA quality F(R), measuring the loss in quality of removing R.  This should not be very high, since the reference point is typically close to "Hell" - the worst possible corner of the search space.  Then we also calculate F(Pi) for each member of the population.  We can then create an n-length vector of values (F(R) - F(Pi)), a list measuring each member's distance back to the reference point.  The average of this vector approximates the hypervolume metric and is much simpler to compute.
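The computation can be sketched as follows, using the additive epsilon indicator from [1] as the binary quality indicator I.  The population points, the reference point and the scaling factor kappa are all illustrative choices, not values from any real run:

```python
import math

KAPPA = 1.0  # scaling factor for the indicator (illustrative)

def epsilon_indicator(a, b):
    # Smallest amount by which point a must be translated so that it
    # weakly dominates point b (minimization on every objective).
    return max(ai - bi for ai, bi in zip(a, b))

def F(point, pop):
    # Loss in quality if `point` were removed from population `pop`.
    return sum(-math.exp(-epsilon_indicator(other, point) / KAPPA)
               for other in pop if other is not point)

# Blue points near the pareto frontier, plus a reference point R
# near "Hell" (the worst corner of this minimization space):
population = [(1.0, 4.0), (2.0, 2.5), (4.0, 1.0)]
R = (5.0, 5.0)
everyone = population + [R]

# The IBD vector: each member's distance back to the reference point.
ibd = [F(R, everyone) - F(p, everyone) for p in population]
approx_hypervolume = sum(ibd) / len(ibd)
```

Note that R, being dominated by every frontier point, gets the lowest F of all: removing it costs the population almost nothing, which is exactly why the differences F(R) - F(Pi) act as distances from the frontier back to the reference point.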

Figure 3.  Instead of euclidean distance to each blue population point, we use a differences of qualities method to approximate distances to the purple reference point.  Note that euclidean distance would be a poor metric, as the black pareto-frontier curve contains varying distances to the purple reference point.

The reference point can be chosen by taking the median of the initial population.  For each iteration of a generational MOEA, I propose the above Indicator-Based Distance (IBD) metric to summarize how much optimization has occurred since the algorithm's start.  In this way, two MOEAs can be compared more easily.

[1] E. Zitzler and S. Künzli, "Indicator-Based Selection in Multiobjective Search," in Conference on Parallel Problem Solving from Nature (PPSN VIII), pp. 832-842, Springer, 2004.

Friday, February 15, 2013

#16 - The Art of Game Design - Chapters 2 and 3

Chapter 2 - The Designer Creates an Experience

The entire purpose of creating a game is to create, for the consuming player, a memorable experience that the player can take with them beyond the game.  How to create an experience - and even how to define an experience - are difficult questions.  There are a number of fields involved; for me, psychology and anthropology are among those closest to my studies.  The chapter culminates in the very first lens in the book:

Lens #1: The Lens of Essential Experience.  Think about what is most important in the game to relay to them as an experience, and focus on bringing that to life.

Chapter 3 - The Experience Rises out of a Game

In this chapter, we seem to be concerned primarily with defining the field of games itself by describing definitions of key words such as "Game", and "Fun".

Lens #2: The Lens of Surprise: Not everything in your game should be predictable.  Surprise is a necessary element.
Lens #3: The Lens of Fun: Is your game fun?  That is, does it entertain players?
Lens #4: The Lens of Curiosity: Consider player motivations.  Entertainment is also engagement.
Lens #5: The Lens of Endogenous Value: Consider what is valuable to the player in the game, in terms of why the player is even there to begin with.
Lens #6: The Lens of Problem Solving: Challenges are necessary in a game as well.  And problem solving is one way to provide challenges.

Wednesday, February 13, 2013

#15 - Cause & Reaction - Introducing JMOO

Cause & Reaction - the essential driving force in life.  Think about it.  The human brain takes in a set of inputs (a configuration of the five senses as read by your bodily sensors), and through some very complex and ill-understood mechanism, maps those inputs to some output, i.e. a bodily action.  Of course, given higher-level control over brain function, we can choose to do whatever we want - but the inputs are there in front of us to analyze and evaluate, and we make a decision on what to do either way.  Often, when it is a decision we have thought about, we label it an action; otherwise, a reaction.

This simple model occurs everywhere in life, inanimate or living, in massive parallel.  We just so happen to also be studying this model in computer science, but we haven't quite decided on a good name yet.  Some would prefer to use the terminology "Data Mining".


In data mining, some prefer to focus on the input space, and others on the output space.  In my recent work, I've been trying to learn how to optimize the output space by searching through the input space.  In my field, though, these go by different names: the inputs are called decisions and the outputs are objectives.  I don't think we need to clutter the simple core of the idea with different names used by different fields - but it is what it is.


I've developed a convenient way to represent this model and coded it in Python under the name JMOO, short for Joe's Multi-Objective Optimization.  In the figures above, the input and output boxes aren't necessarily singular values - they're often vectors of many values.  We use the term multi-objective optimization (MOO) for the problem of optimizing across many objectives; it is typically a hard problem when there are trade-offs among the objectives (i.e. you can't optimize one without screwing up another).

JMOO is extremely simple and can be used to represent an input/output model.  It features a very simple way to add new models, and there are already many baseline models representing commonly used problems such as Fonseca, Schaffer, etc.  We can handle upper and lower bound ranges for the inputs, as well as constraints for cross-validating all the inputs together.  The inputs can be generated randomly or assigned by an outside module.  And lastly, we can evaluate the model and get the output scores for each objective.


Let's try to build the brain now.  Sounds complicated?  Not really!  We'll see where all the real work belongs, though, as there are some big gaps.  This could easily be represented in JMOO; the actual optimizing, however, requires outside modules where the search algorithms reside.


Thursday, February 7, 2013

#14 - The Art of Game Design - Chapters 1,23,24

Chapter 1 - In the Beginning, there is the Designer


The game industry spawns from but a collection of dots on the map - the ones who spark the innovation and cast their ideas into the fires of the hot idea-baking, iron world, where ideas get pressed through the lives of many who are involved with the production of a game.  The many who are involved are responsible for its reception by those who would be labeled as the consumers, and the talents are spread across a large number of fields, from Anthropology to Fine Arts, Mathematics and Fluid Dynamics, Communication and Presentation, and so many more.  Indeed, the game designer is a jack of all trades.

Myself, I am the type of person who likes to figure out "why" - as in, why do people play games?  And so I dig into the psychology of the human brain and attempt to define "fun" as a desired state of mind in which entertainment is taking place.  But fun isn't always a pure feeling of receiving "pleasure chemicals" - sometimes we struggle and anguish over a difficult task that is in no way fun itself, yet we are pleased to say we were entertained during the process.  And in the end, a "Zillmann excitation transfer" effect occurs when you finally achieve a difficult accomplishment: the player breathes a euphoric sigh of contentment before continuing on to the next great challenge.

In chapter 1, we get a great overview of what game design means from the perspective of Jesse Schell.  We are told that "listening" is the single greatest talent of the game designer, and this rings true because the designer must be that jack of all trades.  In the fields where you are lacking, it's important to listen where you tread in the dark.

Chapter 23 - The Designer Usually Works with a Team


Jesse Schell talks a lot about love in the game design process.  He refers to members of the team loving what they are building, and it is a love that must be shared so as to avoid conflict of interest.  This applies to communication as well, because without a solid love there can be no passionate communication.  Beyond the aesthetics however, there are some serious points to summarize what "good communication" means.  Trust, Persistence and Honesty are just a few of them.

Lens #88: The Lens of Love.  Do (we) love what we're building?
Lens #89: The Lens of the Team.  Is the team communicating properly?

Chapter 24 - The Team Sometimes Communicates Through Documents


Documents can be used as a way to commit thoughts into writing and enable you to remember them more easily.  Documents can also serve as a way to make it seem as if you are more organized.  Most importantly though, sometimes documents are used to communicate, as so cleverly pointed out in the title of this chapter.

Lens #90: The Lens of Documentation.  What documentation do we need?

Tuesday, February 5, 2013

#13 - Joe's Theory of Fun

My Theory of Fun is largely the prime component of what I hope will be my eventual Ph.D. thesis.  A broad overview of fun examines gaming, writing, and movie production from the view of entertainment, by studying the human cognitive psychology of what "fun" means to the consuming "player".

"Fun" is a particular state of mind desired by the human psyche.  There are many mediums by which to achieve such a state of mind, but we choose to focus primarily on the three largest entertainment industries - Games, Books, and Movies.  A quick look around the web reveals that the gaming industry has largely surpassed other forms of entertainment, standing presently at roughly $25 billion.  In a sense, this could be because the gaming industry combines aspects of both books and movies into one.

My Theory of Fun comprises an analysis of the typical consumer to the industry, whether it be books, games or movies.  Below is an overview of the player life cycle followed by some terms that comprise part of the overall model.  Perhaps at this point, the thesis would better be coined, "A Model of Fun: Games, Books and Movies."

Stages of Player Life Cycle

First Glance: The consumer first hears of or sees the product in some form, typically via advertising or word of mouth.  Immediately, the consumer forms a mentally approximated value, their "Expectation" of how good the product is.  If the consumer's "Willingness" value is above some mental threshold, the "Entry Threshold", they will want to explore the product further, or perhaps even purchase it to try.

First Play: At this stage, the product is given a first testing by the consumer, who has presumably purchased it.  The "True Value" of the game is being judged by the player against their "Expectation" of how good the product is.  There are then two cases.  If "True Value" is larger than "Expectation", then the game is good - how good depends on how much larger.  Otherwise, "True Value" is less than "Expectation", the product does not satisfy the consumer, and it is regarded as a bad game, or a not-so-good game, depending on how much lower.

Game Play: Presumably, a consumer only reaches this stage if the game is worth playing - that is, if the game's "True Value" beat their "Expectation" of how good it should be.  However, "Expectation" is an ongoing assessment for the consumer - the game's "True Value" will rise or fall across the life-span of the game's playing.  A game should be replayable enough that the consumer gets his money's worth.  Games that are very replayable retain their "True Value", at times even raising the bar as the game is played.

Quit: The final stage, which every consumer is fated to reach at some point.  Ideally, it is reached only after a lengthy game-play stage, once the consumer's "Satisfaction" has been saturated.  Every consumer inherently has a "Satisfaction" threshold that must be met before they can pass the game on to others as something they'd recommend.  As the consumer plays the game, their "Satisfaction" slowly grows until they believe the game to have been played out - when "Replayability" reaches zero.

Time Stream: This stage of the cycle lies beneath every other stage and serves as a catcher for dissatisfied or bored consumers, but it also serves as a reminder that, over time, the game may become interesting enough to try once more.  This happens when a player's "Willingness" level returns above some "Re-Entry Threshold".


Overview of Terms

Here are some terms that may portray the model of the theory of fun a bit better.

True Value: A value between 0.0 and 1.0.  The true value of the game is unknown and judged by the consumer.  It differs between any two consumers, as it reflects how good the consumer thinks the product is.

Expectation: Between 0.0 and 1.0, the expectation of a product is once again a consumer-specific attribute which defines how good they expect the product to be.

Entry Threshold: A threshold value set by the consumer as to how high willingness must be for the consumer to "enter" into the player life cycle and make the purchase.

Willingness: Another value between 0.0 and 1.0, defined as how willing the consumer is to try the product.  When willingness is larger than the entry threshold, the consumer is willing to purchase the product and try it for themselves.

Game Renown: A propelling effect - the consumer's conception of how popular the game is - which affects Willingness over time.

Player Intrigue: Defined as True Value - Expectation, this is how well the player likes the game.  When intrigue is positive, the player liked the game, and when negative, the player did not like the game.

Playability Multiplier: This value can be a detriment or a plus for the game's True Value.  Playability refers to how playable the game is, taking into consideration things such as game balance, and interface accessibility.

Replayability Effect: This value is in effect a velocity that affects a game's True Value, but it may not be modeled linearly - sometimes a True Value can rise and fall as the game carries the consumer along.

Satisfaction: A value which slowly grows or falls as the consumer uses the product and measures how satisfied the consumer is with their purchase of the product.  At first play, satisfaction begins at 0.0.

Satisfaction Saturation Point: A threshold which the satisfaction value must surpass if the consumer is to recommend the product to others and pass themselves on as being satisfied with the purchase of the product.

Replayability: The amount of play time left in the game at a given time period.

Quit Effect: Willingness decreases as the product nears completion.

Time Stream Effect: Willingness slowly increases while the product is not being used.  When willingness rises above the re-entry threshold, the consumer is willing to use the product again.

Re-Entry Threshold: A threshold value set by the player after they quit the game.  Once willingness rises above this value, the player may use the product once more.
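The decision points in the model above can be sketched in a few lines of Python.  This is only an illustration of the definitions, not an implementation of the thesis; the function names and all numeric values in the usage note are my own, chosen to exercise the terms defined above.

```python
# A minimal sketch of the model's decision points.  All names are
# illustrative; the model itself assigns no specific numbers.

def enters_life_cycle(willingness, entry_threshold):
    # First Glance: the consumer enters the life cycle when
    # Willingness exceeds the Entry Threshold.
    return willingness > entry_threshold

def player_intrigue(true_value, expectation):
    # Player Intrigue = True Value - Expectation.
    # Positive -> the player liked the game; negative -> they did not.
    return true_value - expectation

def will_recommend(satisfaction, saturation_point):
    # Quit: the consumer recommends the product once Satisfaction
    # surpasses the Satisfaction Saturation Point.
    return satisfaction > saturation_point
```

For example, a consumer with Willingness 0.7 against an Entry Threshold of 0.5 would enter the life cycle, and a True Value of 0.8 against an Expectation of 0.6 yields a positive Intrigue of 0.2 - a liked game.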

Monday, February 4, 2013

#12 - Pysteroids: Asteroids in Python

Here's a game I developed in a very short time frame (3-5 days).  It's called Pysteroids: Asteroids in Python, and it's built with the Pygame library for Python - one of the simplest game design frameworks I've ever worked with.  It has an easy-to-use event system that makes timed events and sequencing of behavior simple, as well as a very simple graphics module.



Download

Download the installer from here, and run it.
Alternatively, here.
The source is a single Python script file, though it uses a few media resources (songs, sounds).

Overview

Pysteroids is basically your Asteroids clone with a few of my own innovations and tweaks.  Take control of the ship and shoot down asteroids until they are dust.  Be careful not to get hit too many times, or it's game over.

Here I present my concept of a "Zero Disconnect" interface.  From the moment you first run the game, you are in control of the ship, and you must fly it into menu options to make your choices on how to play, view the Help & About, or customize the ship colors through the Ship Painter.



How-To-Play

Use the keyboard to pilot the greenish ship in the center of the screen.  Fly using the Up arrow (or alternatively, the W key) to give thrust to the ship.  Rotate the ship using the Left and Right arrow keys (or alternatively, the A and D keys).  Shoot with the space bar, or a mouse click in the window.  Use both the space bar AND the mouse for faster shooting!

Steer clear of asteroid enemies, but be sure to pick up the items that can sometimes spawn, as they can boost and upgrade your ship's guns, give out free points or power-up a temporary shield.  Every 10,000 points grants an extra life!


Enjoy all kinds of particle effects and a variety of gun types.  Everything you see is generated on the fly at random - the game loads zero images beforehand.  When the game is over, fly into "Back" to return to the main menu, or press Q on the keyboard.  This physics-intensive game includes elastic collision rebounding and lots of trigonometric degree calculations.
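As an illustration of the elastic-collision rebounding mentioned above, here is the standard one-dimensional case (a generic sketch, not the game's actual code - a 2D game like Pysteroids would apply this along the collision normal between two bodies):

```python
def elastic_collision_1d(m1, v1, m2, v2):
    """Post-collision velocities for a 1D elastic collision.

    These are the textbook formulas derived from conservation
    of momentum and of kinetic energy.
    """
    v1_new = ((m1 - m2) * v1 + 2 * m2 * v2) / (m1 + m2)
    v2_new = ((m2 - m1) * v2 + 2 * m1 * v1) / (m1 + m2)
    return v1_new, v2_new
```

A familiar sanity check: two equal masses simply swap velocities, e.g. `elastic_collision_1d(1.0, 5.0, 1.0, -3.0)` returns `(-3.0, 5.0)`.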




#11 - Domination in Multi Objective Optimization

Domination

Domination in multi-objective optimization concerns quality indication when comparing members of a population for a search space problem.  To reiterate, a search space problem is one in which, given a set of decision parameters, a model evaluates the decisions to score its objectives.  In mathematical terms, this is input (x) and function output (y = f(x)).

A quality indicator is used when comparing two members of a population.  That is, it is used to compare two sets of output vectors (the objective scores).  As an example, consider two members of the population for which the objective scores are as follows:

P1: [0.5, 0.8]
P2: [0.4, 1.0]

We want to know if P1 is better than P2 or vice-versa.  That is, we want to know if P1 dominates P2.  For each objective, however, we need to know whether higher numbers are better than lower, or vice-versa.  In our example, let us assume lower numbers are better.  As humans, we would first examine the first column and ask ourselves in which member the objective score is lower.  Since 0.4 < 0.5, P2 wins this column.  Similarly, P1 wins the second column, because 0.8 < 1.0.  Since this is a tie - each member won one column - how do we decide which member is the more dominant?  Intuition would suggest that since P1 won its column by 0.2 but lost the other column by only 0.1, P1 is slightly more dominant over P2.  However, the classical definition of domination requires no losses and at least one win.

In terms of classical domination, there is no clear winner in our example.  So at times we may need something a little weaker - a definition which guarantees that one member dominates.
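The classical definition - no losses and at least one win - is easy to state in code.  A minimal sketch in Python, assuming (as in the example above) that lower scores are better on every objective; the function name is my own:

```python
def dominates(a, b):
    """Classical Pareto domination, minimizing every objective:
    a dominates b if a is no worse than b on every objective
    and strictly better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better
```

For P1 = [0.5, 0.8] and P2 = [0.4, 1.0], both `dominates(P1, P2)` and `dominates(P2, P1)` return `False`, matching the "no clear winner" conclusion above.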

Quality Indicators

Quality indicators are often used to combine a vector of many output scores into a single scalar value, so that one population member can be compared quite easily to another.  They allow us to answer the questions of "how much" and "in what way" one population member is better than another.  Not all quality indicators are binary (comparing two members); some are unary (yielding a measure of quality for a single member).  Each indicator yields a value between 0 and 1 (0-100%) to reflect how close to "perfect - as good as it gets" it is.  For a general overview, refer to [Ref #1].


Examples

This may become a bit more technical from this point onward.  We will demonstrate the use of a quality indicator described in [Ref #2]: F(P1) = sum(-e ^ (I{P2,P1} / k)), where I{P2,P1} is a binary indicator of choice, and k is the number of attributes in P.  Here, we will use a binary epsilon indicator, which simply gives us the difference between the attributes in the same column of two members of the population.

EX1: We want to maximize both columns.

P1: [0.5, 0.8]
P2: [0.4, 1.0]

F(P1) = -e ^ ((0.4 - 0.5)/2) + -e ^ ((1.0 - 0.8)/2) = -2.06
F(P2) = -e ^ ((0.5 - 0.4)/2) + -e ^ ((0.8 - 1.0)/2) = -1.96

Since F(P2) > F(P1), P2 is the better member.  The small difference here suggests they are rather similar.

EX2: This time, we want to maximize both columns again, but it is more obvious who wins.
P1: [0.2, 0.3]
P2: [0.8, 0.9]

F(P1) = -e ^ ((0.8 - 0.2)/2) + -e ^ ((0.9 - 0.3)/2) = -2.69
F(P2) = -e ^ ((0.2 - 0.8)/2) + -e ^ ((0.3 - 0.9)/2) = -1.48

F(P2) > F(P1) by far, as was obvious just from eyeballing the objective scores yourself.

EX3: This time, we want to minimize both columns.  We stick negatives in to apply this "weight".
P1: [0.2, 0.3]
P2: [0.8, 0.9]

F(P1) = -e ^ ((-0.8 - -0.2)/2) + -e ^ ((-0.9 - -0.3)/2) = -1.48
F(P2) = -e ^ ((-0.2 - -0.8)/2) + -e ^ ((-0.3 - -0.9)/2) = -2.69

F(P1) wins this time, as we would have thought.

EX4: Now we just want to maximize the first column, and minimize the second column.
P1: [0.2, 0.3]
P2: [0.8, 0.9]

F(P1) = -e ^ ((0.8 - 0.2)/2) + -e ^ ((-0.9 - -0.3)/2) = -2.09
F(P2) = -e ^ ((0.2 - 0.8)/2) + -e ^ ((-0.3 - -0.9)/2) = -2.09

F(P1) = F(P2)!  Just as we expected, since both columns differ by 0.6, in opposite directions.
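The examples above can be reproduced with a short Python sketch of this indicator-based fitness.  The function name and the weight convention (+1 for a maximized column, -1 for a minimized one, matching the sign flips in EX3 and EX4) are my own:

```python
import math

def indicator_fitness(p, q, weights, k=2.0):
    """F(p) relative to q: sum over columns of -e^(w_i*(q_i - p_i)/k),
    where the binary epsilon indicator (q_i - p_i) is the per-column
    difference, w_i is +1 to maximize or -1 to minimize that column,
    and k is the number of attributes."""
    return sum(-math.exp(w * (qi - pi) / k)
               for pi, qi, w in zip(p, q, weights))
```

For instance, `indicator_fitness([0.2, 0.3], [0.8, 0.9], [1, 1])` reproduces the F(P1) of EX2 (about -2.7), and with weights `[1, -1]` the two members of EX4 come out exactly equal.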


References

[Ref #1] http://www.tik.ee.ethz.ch/pisa/publications/emo-tutorial-2up.pdf
[Ref #2] ftp://ife.ee.ethz.ch/pub/people/zitzler/ZK2004a.pdf



Thursday, January 31, 2013

#10 - The Art of Game Design - Chapters 10,11,12

Chapter 10: Some Elements are Game Mechanics

Let's begin with a definition of what Game Mechanics are.  But therein lies the problem - there is no clear, concise, standard definition.  To me, game mechanics are the inner workings of a game, but Jesse Schell categorizes game mechanics broadly, expanding mechanics to include not only the interior mechanics but also aspects such as rules, objectives, and more.  This culminates in quite a number of lenses:

Lens #21: The Lens of Functional Space.  Consider the dimensions and physics of the game world itself - the space within which entities in the game exist.
Lens #22: The Lens of Dynamic State. Consider the entities that inhabit the game world as being in "states".  What knowledge do they have, and what actions can they perform?
Lens #23: The Lens of Emergence.  Outline the actions of the entities in the game world, and consider the consequences of such actions.
Lens #24: The Lens of Action.  Consider a subset of actions which are important to achieving goals and impacting behavior of other entities, and the relative size of such a subset.
Lens #25: The Lens of Goals.  Consider goals in the game, short-term and long-term, achievements, and the plausibility of them.
Lens #26: The Lens of Rules: Describe the rules in your game.
Lens #27: The Lens of Skill: Consider the skill required to play the game, and win it.  Design for challenge and fun; a flow state between anxiety and boredom.
Lens #28: The Lens of Expected Value.  Consider the odds of random events in the game, and assign values to each.  The expected value (rewards) are often valuable to examine.
Lens #29: The Lens of Chance.  Examine what in your game is chance.  Is it too random?  Too much chaos is never a good thing; but perceived control within chaos is.

Chapter 11: Game Mechanics must be in Balance

Game mechanics as described in chapter 10 above must be in balance.  This means that no particular player in a game has an unfair advantage over the others.  There are several facets to this, which are further explored in the following lenses:

Lens #30: The Lens of Fairness.  Consider fairness from the viewpoint of every possible player.
Lens #31: The Lens of Challenge.  Similar to lens #27; consider the challenge level of a game, which should be between anxiety and boredom, in an optimal "flow state".
Lens #32: The Lens of Meaningful Choices.  When a player makes a choice in the game, they should feel as if it mattered, without it being a dominant choice in itself (not a "gimme".)
Lens #33: The Lens of Triangularity.  "Triangularity" refers to a set of states of the player, in which going to one state is "playing it safe", while the other is "taking a risk for higher reward".  We are concerned if we have such a preferred mechanism in the game, and if it is in balance (i.e. a hard choice to make.)
Lens #34: The Lens of Skill vs. Chance.  Consider the balance between skill and chance in the game.  Similar to lens #29.
Lens #35: The Lens of Head and Hands.  Consider the balance between physical skill and mental skill, and be sure they match the targeted audience's preferences.
Lens #36: The Lens of Competition.  Consider the level of competition in the game from across all levels of skill.  Who can play?  Is it easy to become a master?
Lens #37: The Lens of Cooperation.  Is cooperation required, and is communication readily available?  Consider social aspects of the interaction, and whether it's between strangers or friends.
Lens #38: The Lens of Competition vs. Cooperation.  Consider the balance between the two, and the preference of the audience.
Lens #39: The Lens of Time.  Consider how long a game takes, and whether it can be played that long without the player getting bored.
Lens #40: The Lens of Reward.  Do rewards make sense and are they plenty enough?
Lens #41: The Lens of Punishment.  Do punishments make sense and are they plenty enough?
Lens #42: The Lens of Simplicity/Complexity.  Consider the level of complexity in the game, and of all its individual mechanics.  Too complex is bad, but so is too simple.
Lens #43: The Lens of Elegance.  Is the game elegant?  And all of its individual elements?  Elegance can make a game a masterpiece.  Make sure all elements of the game are useful and have a purpose.
Lens #44: The Lens of Character.  Character is the opposite of elegance.  Consider the character of the game, including its flaws and humorous aspects.
Lens #45: The Lens of Imagination.  Imagination can help a player immerse into the game world.  Consider how to inspire imagination in the game.
Lens #46: The Lens of Economy.  Consider the game economy, and game money, and prices of items within - and the balance of each therein.
Lens #47: The Lens of Balance.  Culminating chapter 11 all into a single lens - is the game balanced?

Chapter 12: Game Mechanics Support Puzzles

Considering puzzles in a game, there are a few guidelines outlined in these lenses below:

Lens #48: The Lens of Accessibility.  Visualization of the puzzle must be accessible, and the player should have a rough idea of how to "dive in" and begin.
Lens #49: The Lens of Visible Progress.  Progress should be visible as it is being achieved, as well as meaningful.  
Lens #50: The Lens of Parallelism.  Parallelism refers to making multiple puzzles available at once.  However, too many things at once may not be very good.  Consider bottlenecks in which completion of earlier puzzles can constrain advancement of the game, and the connectivity of such tasks.
Lens #51: The Lens of the Pyramid.  Pyramids apparently fascinate us, because they have a single point at the top which seems to be the culmination of lower layers.  Consider building the game up in this manner.
Lens #52: The Lens of the Puzzle.  Consider puzzles in the game that make the player think, and each of the puzzle principles outlined in chapter 12.