General Purpose GPU Programming for Mobile

 

GPU Computing

There are times when your app collects a whole lot of data (by capturing it live, downloading it from a third party, or importing it from email), and you want to run a whole lot of reports and visualisations inside your app.

Doing heavy computation on a smartphone is a very challenging job, as I have experienced myself when processing video data. A useful trick is to use the GPU, which can be 10-100 times faster than the CPU at floating-point calculation.

Setting up a GPU shader and calculation can be a huge task for anybody who is not familiar with OpenGL. This excellent GPGPU library can help you a lot: all you need to do is change the calculation inside the vsh file. I am planning to make it even easier to use in the near future.
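The library above targets mobile OpenGL shaders, but you can get a feel for the CPU/GPU floating-point gap from R as well. Below is a rough sketch using the gpuR package (my own illustration, not the library discussed above; it assumes an OpenCL-capable GPU and driver):

library(gpuR)  # OpenCL-backed matrix classes

n <- 2000
A <- matrix(rnorm(n * n), n, n)
B <- matrix(rnorm(n * n), n, n)

gpuA <- gpuMatrix(A, type = "float")  # copies the data to the GPU
gpuB <- gpuMatrix(B, type = "float")

system.time(A %*% B)        # CPU matrix multiply
system.time(gpuA %*% gpuB)  # GPU matrix multiply

The actual speed-up depends on the GPU, the matrix size and the precision, so treat the 10-100x figure as a rule of thumb rather than a guarantee.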

Improve your Android emulator performance

If you want to supercharge your Android emulator in Eclipse/IntelliJ, use the following excellent tip from Stack Overflow.

IMPORTANT NOTE: Please first check Intel's list of processors with VT to make sure your CPU supports Intel VT.

HAXM Speeds Up the Slow Android Emulator

HAXM stands for “Intel Hardware Accelerated Execution Manager”.

Currently it supports only Intel® VT (Intel Virtualization Technology).

The Android emulator is based on QEMU; the interface between QEMU and the HAXM driver on the host system is designed to be vendor-agnostic.


Steps for Configuring Your Android Development Environment for HAXM

  1. Update Eclipse: Make sure your Eclipse installation and the ADT plug-in are fully up to date.
  2. Update your Android Tools: After each Eclipse plug-in update, it is important to update your Android SDK Tools. To do this, launch the Android SDK Manager and update all the Android SDK components. To take advantage of HAXM, you must be on at least release version 17.


  3. Download the x86 Atom System Images and the Intel Hardware Accelerated Execution Manager Driver, using the Android SDK Manager.


  4. Install the HAXM Driver by running “IntelHaxm.exe”.
     It will be located in one of the following locations:

    • C:\Program Files\Android\android-sdk\extras\intel\Hardware_Accelerated_Execution_Manager
    • C:\Users\<user>\adt-bundle-windows-x86_64\sdk\extras\intel\Hardware_Accelerated_Execution_Manager

    If the installer fails with a message that Intel VT must be turned on, you need to enable it in the BIOS. See the description of how to do this here.

Install .exe or .dmg

  5. Create a new x86 AVD:

Create AVD

    Or, for newer SDK versions, use the equivalent dialog in the AVD Manager.

Time Management for Software Project

Time, quality and cost

The basis of project management is the balance between time, quality and cost: pick two of them. With that in mind, we can discuss how to estimate time once we have fixed the quality and cost (just kidding, you can pick quality and time as well).

 


Time Cost Quality – Pick 2

Every project I have done over the last 5 years has contained plenty of unknowns and challenges: I have worked with many of the latest technologies, and I sometimes join a new project with no prior experience in its stack.

The main problems in estimating projects are:

   - Miscommunication

   - Overestimating team competency

   - Task breakdown not done carefully

   - Underestimating integration between modules

   - Technical unknowns

   - Requirement changes

Miscommunication

This normally happens with outsourcing companies or when teams are spread across different locations. When I chat with clients or teammates, people tend to keep messages short, which leads to lots of assumptions and miscommunication. Voice chat on Skype drains energy really fast and requires both sides to communicate synchronously. The miscommunication can be between the clients and your team, or inside your team itself. You definitely don’t want to come back after 3 months and find the clients feeling like the figure below….

Miscommunication

Well, a famous and familiar figure huh?

Overestimate team competency

I pick my team members very carefully, and I also have strong belief in my people. That is why I overestimate my team members most of the time. Belief and estimation should be kept separate from each other: “Keep a cold mind and a warm heart.” That means you should keep strong faith in your members and encourage them to do good work, but when evaluating them for a specific project, estimate carefully.

Task breakdown & integration

These are the two most underestimated aspects of managing any project. It is easy to look at the whole and say it will take 3 days to finish everything. However, if you actually break it down and allow time for writing the code, debugging, testing and fixing bugs again, the total can come to 10 days. Oh, and I forgot to mention integration between tasks or modules. Sometimes we assume that if task A is done and task B is done, combining them will be easy. Unfortunately, that is usually not the case. The combination of module A and module B can generate a whole new level of bugs that need fixing, bugs that may only appear under specific conditions in modules A and B. It is important to run lots of tests and allocate debugging time for this phase, as shown in the following figure.

Work Breakdown Structure
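To see how a “3-day” estimate turns into 10 days, here is a toy breakdown in R (the numbers are purely illustrative):

# A feature estimated at "3 days" as a whole, broken down into real activities
tasks <- c(coding = 3, debugging = 2, testing = 2,
           bug.fixing = 1.5, integration = 1.5)
sum(tasks)  # 10 days, not 3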

Technical Unknown

The risk level of a project depends on how much unknown territory it contains. Some requirements may be unclear, or the technical solution may not yet be known or implemented by anyone on the team. It takes an expert with a sharp eye to identify which areas are unknown and allocate time for them. The problem is that people don’t know what is unknown…

One cool example that struck me hard is how video playback is implemented in iOS and Android. In iOS, seeking to an arbitrary frame is a piece of cake, and I assumed a well-designed framework like Android’s would be the same. I was totally wrong. It took me 2 months of searching through many different solutions and approaches to figure out how to do it correctly.

Only when you really get into coding do you know which parts are challenging.

How to deal with changes

Another problem is that customers change their requirements based on what you demonstrate to them. An Agile approach normally helps by clearing up customers’ assumptions as early as possible instead of letting them wait for the finished product. By receiving early feedback, we can adjust the product continuously.
HOWEVER…

One problem with the Agile approach is that customers keep changing their minds far too often. At some point we really need to lock the requirements, no matter what the customers want. Striking a balance between these two forces keeps customers happy and our project budget and timeline in good shape.

Linkedin Network Analysis with ggplot2

In this part, we will demonstrate how to create a network graph like the one you see in the Linkedin InMap http://community.tradeking.com/upload/0002/3315/linkedin-inmap-jude.jpg

You could achieve a similar result using igraph’s own plotting, but in this post we will focus on how to do it with ggplot2. We will not cover how to download the LinkedIn data and format it into an appropriate data structure. We assume that you can get your data into the appropriate format, like the LinkedinData example file (download for example), an adjacency matrix like the following:


             Jeff Grif  Cody Bolh  Curtis Blag  Eric Wood
Jeff Grif            0          0            0          0
Cody Bolh            1          0            0          0
Curtis Blag          1          0            0          0
Eric Wood            1          0            0          0

Our output will be as beautiful as the image below:

Linkedin InMap with R

 

We will use the following code to draw the map from that data. It needs the sna, igraph, reshape2, Hmisc and ggplot2 packages; the layout call and the final size/colour mappings are one reasonable choice, flagged in the comments:

library(sna)       # gplot(), used here for the node layout
library(igraph)    # community detection
library(reshape2)  # melt(), to turn the matrix into an edge list
library(Hmisc)     # bezier(), for curved edges
library(ggplot2)

# Read the adjacency matrix; the first column holds the member names
LinkedinData <- read.table("LinkedinData", header = TRUE)
names <- as.character(LinkedinData[, 1])
LinkedinData <- LinkedinData[, -1]
rownames(LinkedinData) <- names
colnames(LinkedinData) <- names
LinkedinData <- data.matrix(LinkedinData)

# Node layout: sna::gplot() invisibly returns the coordinates it plots
# (an assumed but reasonable choice for this step)
layoutCoordinates <- gplot(LinkedinData)

# Keep only the pairs that are actually connected
LinkedinList <- melt(LinkedinData)
LinkedinList <- LinkedinList[LinkedinList$value > 0, ]

## Community detection
g <- graph.adjacency(LinkedinData, mode = "undirected")
lead <- leading.eigenvector.community(g)
temp <- layoutCoordinates
temp <- data.frame(temp, lead$names, lead$membership,
                   colSums(LinkedinData) + rowSums(LinkedinData))
colnames(temp) <- c("x", "y", "name", "group", "size")

# Function to generate paths between each connected node
edgeMaker <- function(whichRow, len = 100, curved = TRUE){
  index.temp1 <- which(LinkedinList[whichRow, 1] == temp[, 3], arr.ind = TRUE)
  index.temp2 <- which(LinkedinList[whichRow, 2] == temp[, 3], arr.ind = TRUE)
  if(temp[index.temp1, 5] >= temp[index.temp2, 5]){  # draw from the bigger node
    index <- c(index.temp1, index.temp2)
  } else {
    index <- c(index.temp2, index.temp1)
  }

  fromC <- as.numeric(temp[index[1], 1:2])  # Origin
  toC   <- as.numeric(temp[index[2], 1:2])  # Terminus

  # Add curve:
  graphCenter <- colMeans(temp[, 1:2])  # Center of the overall graph
  bezierMid <- c(fromC[1], toC[2])      # A candidate midpoint for curved edges
  distance1 <- sum((graphCenter - bezierMid)^2)
  if(distance1 < sum((graphCenter - c(toC[1], fromC[2]))^2)){
    bezierMid <- c(toC[1], fromC[2])
  }  # To select the best Bezier midpoint
  bezierMid <- (fromC + toC + bezierMid) / 3  # Moderate the Bezier midpoint
  if(curved == FALSE){ bezierMid <- (fromC + toC) / 2 }  # Remove the curve

  edge <- data.frame(bezier(c(fromC[1], bezierMid[1], toC[1]),  # x
                            c(fromC[2], bezierMid[2], toC[2]),  # y
                            evaluation = len))  # Bezier path coordinates
  edge$Sequence <- 1:len  # For weighting along the path
  edge$Group <- paste(LinkedinList[whichRow, 1:2], collapse = ">")
  edge$index <- rep(temp[index[2], 4], len)  # Community of the target node
  return(edge)
}

# Generate a (curved) edge path for each pair of connected nodes
allEdges <- lapply(1:nrow(LinkedinList), edgeMaker, len = 500, curved = TRUE)
allEdges <- do.call(rbind, allEdges)

# Empty theme, so only the network itself is drawn
new_theme_empty <- theme_bw()
new_theme_empty$line <- element_blank()
new_theme_empty$rect <- element_blank()
new_theme_empty$strip.text <- element_blank()
new_theme_empty$axis.text <- element_blank()
new_theme_empty$plot.title <- element_blank()
new_theme_empty$axis.title <- element_blank()
new_theme_empty$plot.margin <- structure(c(0, 0, -1, -1), unit = "lines",
                                         valid.unit = 3L, class = "unit")

zp1 <- ggplot(allEdges)  # Edges first, so the nodes sit on top
zp1 <- zp1 + geom_path(aes(x = x, y = y, group = Group,
                           colour = factor(index)), size = 0.3)
zp1 <- zp1 + geom_point(data = temp, aes(x = x, y = y, fill = factor(group),
                                         size = size), pch = 21)  # size ~ degree
zp1 <- zp1 + geom_text(data = temp,
                       aes(x = x, y = y, label = name, hjust = 0, vjust = 0))
zp1 <- zp1 + new_theme_empty
zp1 <- zp1 + theme(legend.position = "none")

zp1  # Draw the map

Passion Driven Statistics – Coursera

The association between the measure of diameter and depth among craters with or without layers

Introduction

The study of the properties of craters on Mars, based on the database created by Stuart Robbins (2011), allows us to understand crustal properties, surface ages and modification events.

This project is based on a subset of the Robbins Crater Database with 384,343 craters. In our project, we studied two main questions:

  • The association between the size of craters (diameter) and their depths and number of layers. Craters’ diameters may be larger when the craters have a larger number of layers.
  • Whether the depth of craters with layers has a stronger effect on diameter than the depth of craters without layers.

We think that a large crater should have a high number of layers, and therefore a greater depth and a bigger diameter. These association effects may be stronger when the number of layers is higher.

Research Questions

With that thought in mind, we propose the following two questions that could help us better understand craters’ properties.

  1.  Is the number of layers associated with the measure of diameter among craters on Mars?
  2.  Is the association between depth and diameter similar for craters with and without layers?

Methods

Sample

There are 384,343 observations from the Mars Study database, of which 364,612 craters (94.8%) have no layers and 19,731 (5.2%) have at least one layer.

The Mars Study, conducted by Stuart Robbins, presents a sample of Mars’ craters with their physical properties (e.g. location on Mars, size and depth, three kinds of ejecta morphology, number of layers).

Measure

Each crater has a Crater_ID, assigned internally based on the region of the planet. Crater size is given by the DIAM_CIRCLE_IMAGE variable (in km), the diameter of a non-linear least-squares circle fit to selected vertices along the crater rim.

DEPTH_RIMFLOOR_TOPOG (in km) is calculated by taking the average elevation of N determined points along (or inside) the crater rim.

NUMBER_OF_LAYERS is the maximum number of cohesive layers in any azimuthal direction. There are six levels (from 0 to 5). A new Layer category variable is coded 0 if the crater has no layers (craters without layers) and 1 if it has one or more layers (craters with layers).

In this study, to make the data easier to analyze, the DIAM_CIRCLE_IMAGE variable is divided into 4 categories with the same number of craters in each. The new variable DIAM is set as follows:

DIAM=1 if DIAM_CIRCLE_IMAGE is less than 1.18

DIAM=2 if DIAM_CIRCLE_IMAGE is from 1.18 to 1.55

DIAM=3 if DIAM_CIRCLE_IMAGE is from 1.55 to 2.55

DIAM=4 if DIAM_CIRCLE_IMAGE is greater than 2.55
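The analysis itself was done in SAS, but the same quartile binning can be sketched in R; the data frame and file names below are illustrative assumptions:

# Hypothetical load of the crater subset; adjust the file name to your copy
craters <- read.csv("marscrater_subset.csv")

# Cut diameter into 4 equal-count categories at the quartiles
breaks <- quantile(craters$DIAM_CIRCLE_IMAGE, probs = seq(0, 1, 0.25))
craters$DIAM <- cut(craters$DIAM_CIRCLE_IMAGE, breaks = breaks,
                    labels = 1:4, include.lowest = TRUE)
table(craters$DIAM)  # each category holds roughly a quarter of the craters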

Procedures

The project was analyzed with the SAS program, version 4.3. The source code is presented in a later SAS program post.

Results

Univariate:  

75% of crater diameters are less than about 2.5 km. The mean diameter (3.5 km) is greater than the median (1.5 km), and the standard deviation is 8.6 km. Most craters therefore have small diameters, while some outliers reach significantly higher values, up to 1,164 km.


Univariate result for diameter
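In R, the same univariate numbers would come from the following (continuing the hypothetical craters data frame above):

summary(craters$DIAM_CIRCLE_IMAGE)  # min, quartiles, median, mean, max
sd(craters$DIAM_CIRCLE_IMAGE)       # standard deviation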

In general, crater depths are small: 99% are less than 1 km. The maximum depth is 4.95 km and the minimum is -0.42 km.


Univariate Result for Depth

The DIAM variable has 4 categories, each with nearly the same number of craters.

Nearly 95% of craters have no layers. Only a few craters have four or five layers; the number of craters decreases as the number of layers increases.


Frequency for number of layers

Bivariate:

1.  Because of the large proportion of craters with no layers (94.8%), I drew box plots only for craters with 1 to 5 layers. The figure below shows the box plot of diameter for each layer count. Notice that the spreads within the layers are small and the box plots overlap very little.


                            Boxplot for different number of layers, 1-5

 

 

A bar chart with the mean diameter per layer count is shown next. Most craters with no layers have small diameters. We can see that the mean diameter increases significantly as the number of layers rises from 0 to 5.


                         Bar chart for different numbers of layers (0-5)

As expected, the ANOVA showed a positive and significant association between the number of layers (categorical explanatory variable) and the measure of diameter (quantitative response), with a small p-value (<0.0001) and a high F value (1494.68). That is, as the number of layers increases, diameters increase.


ANOVA test for the relationship between number of layers and diameter mean
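For reference, an equivalent ANOVA can be sketched in R (the published analysis used SAS; TukeyHSD stands in here for the SAS post-hoc comparison):

# One-way ANOVA: diameter (quantitative) by number of layers (categorical)
fit <- aov(DIAM_CIRCLE_IMAGE ~ factor(NUMBER_OF_LAYERS), data = craters)
summary(fit)    # F value and p-value
TukeyHSD(fit)   # pairwise differences between layer counts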

2. Next, we consider the association between crater diameter (DIAM_CIRCLE_IMAGE) and crater depth (DEPTH_RIMFLOOR_TOPOG) in two cases: the subset of craters without layers (case 1) and the subset of craters with layers (case 2). Based on the scatter plot below, we can see an association between the diameter and depth of craters.


                             Scatter plot between diameter and depth

Indeed, the two Pearson correlation calculations both give significant p-values (<0.0001) and positive correlation coefficients.
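In R, the two correlations would look like this (again a hedged sketch of the SAS analysis, using the hypothetical craters data frame):

# Case 1: craters without layers
with(subset(craters, NUMBER_OF_LAYERS == 0),
     cor.test(DIAM_CIRCLE_IMAGE, DEPTH_RIMFLOOR_TOPOG))

# Case 2: craters with at least one layer
with(subset(craters, NUMBER_OF_LAYERS >= 1),
     cor.test(DIAM_CIRCLE_IMAGE, DEPTH_RIMFLOOR_TOPOG))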


 

We can conclude that there is a significantly positive correlation between the diameter and depth of all craters. Furthermore, the correlation coefficient in case 2 (0.73) is greater than in case 1 (0.61). However, we must note that case 1 contains far more data, and its depth values are spread very unevenly between the maximum and the rest, so we can only conclude that the two cases seem to show similarly strong associations. Craters with layers do tend to have greater depths at higher levels of diameter.


Line plot of the relationship between diameter and depth, with and without layers

Discussion

What might the results mean?

Craters with a higher number of layers will have larger depths.

At high levels of diameter, craters with layers seem to be deeper than craters without layers.

Strength

Results are based on a substantial subset sample (384,343 craters) of the Robbins Crater Database.

Limitation

The number of craters without layers is very large compared to the number of craters with layers, which makes tests comparing the two groups difficult. Furthermore, it is quite hard to divide the range of diameters, since nearly 75% of craters measure less than 2.55 km and the rest are larger.

Recommended future research

We could consider how craters’ locations affect their properties; more research is needed on the association between craters’ locations and their depths or diameters.


Reference

Stuart Robbins (2011), Robbins Crater Database.

What surprises me about sport business

Sport Business

I have long been a fan of Manchester United, and it is fascinating to learn about their business and the sports business in general. So I followed a Coursera course, https://www.coursera.org/course/globalsportsbusiness, to find out more about this industry.

I always have these big questions:

  • How much do these sports companies earn?

  • How do they justify their big spending on player transfers?

  • What are the different sources of revenue: advertising, media streaming, player transfers, youth training and transfers, gate tickets?

  • How can weak teams survive compared to big teams like Manchester United or Real Madrid?

  • How do people make sure these teams don’t cheat in their games?

  • What about drug testing? (this question is already answered in this blog post)

Their Revenue and Income

Big sports teams like Manchester United earn very little compared with giant Internet companies like Facebook or Microsoft. Their players can earn a lot compared to other companies’ employees, though.

Manchester United’s market capitalisation is $3.10 billion. That is relatively low compared to Instagram’s $1B or Facebook’s $47B. We will investigate each of their sources of revenue to understand where the big money is and what the league does to make sure it stays competitive and interesting.

Gate Income

Gate income is sometimes shared between the home and away team, with the away team normally receiving 0%-33%. Below is a list of the top 10 clubs by gate receipts; this gate income contributes a significant part of clubs’ income:

1 Real Madrid – 438.6m

2 Barcelona – 398.1m

3 Man Utd – 349.8m

4 Bayern Munich – 323.0m

5 Arsenal – 274.1m

6 Chelsea – 255.9m

7 AC Milan – 235.8m

8 Liverpool – 225.3m

9 Inter – 224.8m

10 Juventus – 205.0m

Media Income:

Media works really well with the sports business, because the percentage of the audience watching sports on delay is very low, only 4.4%. The following table shows the proportion of the audience that watches a delayed broadcast, by category.

Category              % of audience watching delayed
Sport Events          4.4%
Award Ceremonies      14.7%
Comedies              39.5%

There are different business models in this area: national TV networks, cable TV, and regional sports networks. They can benefit from both subscriber fees and advertising revenue.

The media reach of some big sporting events:

- Superbowl: 106 million viewers (in 2010)

- UEFA Champions League: 109 million viewers (in 2010)

Other income:

Clubs also earn income from training young players and selling them to other teams, and from naming fees for stadiums and the team. There is also a salary cap for players within the league.

 

Drug Testing and Statistics

Doping

I was very surprised that people like Lance Armstrong and other athletes could bypass drug tests so easily for many years before getting caught. Recently, I read a blog post about the statistics that reveal the truth behind this:

Anti-doping tests have a huge false-negative problem. I have been talking about this for years

Because of this huge false-negative problem, dopers escape detection most of the time. But if a player tests positive, it is very likely that they really are doping. A doper is usually only exposed when somebody becomes suspicious and requests an official investigation, a legal and expensive process.

And even passing 500 doping tests is not as impressive as you might think; more here:

The anti-doping agencies are so concerned about not falsely accusing anyone that they leave a gigantic hole for dopers to walk through. . . . While we think about Armstrong’s plight, let’s not forget about this fact: every one of those who now confessed passed hundreds of tests in their careers, just like Armstrong did. In fact, fallen stars like Tyler Hamilton and Floyd Landis also passed lots of tests before they got caught. In effect, dopers face a lottery with high odds of winning and low odds of losing.
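A quick back-of-the-envelope calculation in R shows how such a lottery works; the per-test catch rates below are invented for illustration, not taken from the cited posts:

# Suppose timing and micro-dosing push the chance that any single test
# catches a doper down to a fraction of a percent (illustrative numbers)
sens <- c(0.001, 0.002, 0.005)  # assumed per-test catch probability
n <- 500                        # tests passed over a career
round((1 - sens)^n, 3)          # 0.606 0.368 0.082

With a per-test catch rate of 0.1%, a doper passes all 500 tests more often than not, which is exactly the “high odds of winning” lottery described in the quote above.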

SnappyCam App looks very good

I just read about this on TechCrunch and tried it out; it is a very handy camera app. Although I am not sure the resulting images actually look great, since each file is only 213 KB? I would also love to know how he does it.

Snappy Camera

 

http://techcrunch.com/2013/07/31/fastest-iphone-camera/

Your standard iPhone camera app is actually pretty slow, able to take just three to six photos per second at 8 megapixels each. But with SnappyCam 3.0, you can shoot 20 full-resolution photos per second thanks to a breakthrough in discrete cosine transform JPG science by its inventor. Twenty frames per second is fast enough to capture shot-by-shot animations or every gruesome detail of an extreme sports crash.

Debt, Inventory and Revenue

Inventory

Your code is your debt

You spend money, effort and bug management to service your debt. Code doesn’t automatically generate revenue; user features and user satisfaction do. It doesn’t matter that you wrote 100,000 lines of code in 10,000 hours with a complexity of 1 million (well, it matters to the technical folks) if those efforts don’t acquire new users or generate more revenue. It is like saying: I borrowed 1 million dollars and spent it all on this project. It sounds cool, but it doesn’t benefit the company. Even worse, it harms the company.

Inventory

Inventory is what you have produced that just sits in a warehouse and does not generate any money; it is even worse if storing it costs you money.
As Joel on Software said, inventory can pile up at each of the following stages of the software process, with different results:

      Decision-Making Process: documentation, product backlog, feature ideas…
      Design Process: diagrams…
      Implementation Process
      Testing Process
      Debugging Process
      Deployment Process

Each stage’s products may never be implemented, may be ignored, or may become unrealistic later. And this is without even talking about the waterfall process, which can make the pile tremendous. For example, a feature backlog written across hundreds of pages of which 90% is never implemented, or a bug database that holds every bug, along with the effort to maintain and understand them, yet only 10% of them get fixed, after a long time.

As with any kind of inventory, after a while the items in it become obsolete and need cleaning up so new things can be added. Obsolete inventory costs you the effort and time to create it, maintain it and get rid of it. It is the same in software engineering: bugs that are no longer bugs (after many updates), feature documentation that no longer matches the current product…

It is important for managers to understand how cost, debt, inventory and revenue map onto the software engineering process. It is easy to measure engineers by how much code they write, but that is the same as measuring how much debt they bring to the team. Higher debt doesn’t mean higher revenue, so be careful.

References:

http://www.joelonsoftware.com/items/2012/07/09.html