The amazing Genetic Algorithms!

Why do we say Data Science? Is it really a part of science? If you think as a statistician or an engineer, it’s difficult to understand.
One of the most beautiful things on earth is nature. We don’t often think about it, but we are surrounded by nature-inspired objects, for example planes (like birds), buildings (like hives) or submarines (like whales). In the computer world we have also looked to nature to learn how to find the best solutions to the most difficult problems.
In the Data Science universe, and more concretely on the algorithm side, we have interesting nature-inspired solutions from fields like Neural Networks, Genetics and Swarm Intelligence.
It is really interesting to think about how to find algorithms that emulate neural networks to solve our daily problems, or to look at bees and ants, use Swarm Intelligence and replicate their behaviour in solutions for Healthcare or Public Administration that improve people’s quality of life. This is my main objective: to try to give VALUE to society, and through the Data Science universe I believe that it’s a reality!
Here you can find examples of biological systems that have inspired computational algorithms.
In this blog you can also find more detail, with examples, about Algorithms in Nature. Turning back to this post, I will focus on Genetic Algorithms; in other posts I will talk about other nature-inspired algorithms.

Computer science and biology have enjoyed a long and fruitful relationship for decades. Biologists rely on computational methods to analyze and integrate large data sets, while several computational methods were inspired by the high-level design principles of biological systems.
Biologists have been increasingly relying on sophisticated computational methods, especially over the last two decades as molecular data have rapidly accumulated. Computational tools for searching large databases, including BLAST (Altschul et al, 1990), are now routinely used by experimentalists. Genome sequencing and assembly rely heavily on algorithms to speed up data accumulation and analysis (Gusfield, 1997; Trapnell and Salzberg, 2009; Schatz et al, 2010).
Computational methods have also been developed for integrating various types of functional genomics data and using them to create models of regulatory networks and other interactions in the cell (Alon, 2006; Huttenhower et al, 2009; Myers et al, 2009). Indeed, several computational biology departments have been established over the last few years that are focused on developing additional computational methods to aid in solving life science’s greatest mysteries.



But, what is a Data Scientist?

I have started explaining how to become a Data Scientist, but … what is a Data Scientist? Is there an official university degree or professional career with this name? Do you need a certification to show your skills as a Data Scientist? What are the main skills you need to be called a Data Scientist?

Data Scientist is one of the most recently emerged jobs, and maybe the sexiest job of the century, as you can see in detail in this Forbes article.

A really funny description of a Data Scientist that I found in a tweet: “Person who is better at statistics than any software engineer and better at software engineering than any statistician”.

So, do you need to study Statistics? Do you also need to study Software Engineering? Is there a degree in Data Science?

There are a lot of courses, master’s programs and technical trainings you can find on the internet. For example, in Barcelona there is an interesting Data Science Master’s awarded by the Graduate School of Economics, the Autonomous University of Barcelona and Pompeu Fabra University. These universities understand that this program is intended for:

  • Graduates in Economics and Business with solid background and keen interest in quantitative methods
  • Graduates in Statistics, Mathematics, Engineering, Computer Science, and Physics with the ambition to work with real-world problems and data
  • Programming professionals who want to acquire analytical, quantitative tools to leverage their experience
  • Aspiring PhD students looking for rigorous training in quantitative and analytical methods


How to become a Data Scientist

When I was studying Statistics I thought that this was the degree with the most opportunities for the future. But when I started working I saw that maybe I was wrong. In fact, my first job was as a Junior Java Developer, which had nothing to do with my studies. Some 15 years later, this may have changed.

The evolution of technology has exponentially increased the power of computers. To put it simply, in the area of Data Management, computers let us do two main things:

  • The first is to store all the data. Here the challenge is clear if we think that every 60 seconds Google receives over 4,000,000 search queries, YouTube users upload 71 hours of new video, Pinterest users pin 3,472 photos, Facebook users share 2,460,000 pieces of content and Twitter users share 277,000 tweets (Infographic How much Data is Generated every minute).
  • The second is that we can now exploit this data by applying all kinds of new and advanced data management techniques, like algorithms for predicting patterns, or parallel processing over terabytes of data to extract and process the valuable information. For example, we can talk about Genetic Algorithms (GA), which take inspiration from nature to find the best solutions; you can find one simple GA exercise using R on the R-bloggers site.
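
To give a feeling for how a Genetic Algorithm works, here is a minimal sketch in Python. It is not the R-bloggers exercise mentioned above, just a toy illustration under my own assumptions: the problem (evolve a string of bits until it contains as many 1s as possible), the function names (fitness, tournament, crossover, mutate, run_ga) and all parameter values are hypothetical choices made for this example.

```python
import random


def fitness(bits):
    """Toy fitness (an assumption for this sketch): the number of 1s."""
    return sum(bits)


def tournament(population, rng, k=3):
    """Selection: pick k random individuals and keep the fittest one."""
    return max(rng.sample(population, k), key=fitness)


def crossover(a, b, rng):
    """One-point crossover: the child takes a's head and b's tail."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]


def mutate(bits, rng, rate=0.01):
    """Mutation: flip each bit independently with a small probability."""
    return [1 - b if rng.random() < rate else b for b in bits]


def run_ga(n_bits=20, pop_size=30, generations=40, seed=0):
    """Evolve a population and return (best individual, best-fitness history)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = max(population, key=fitness)
        history.append(fitness(best))
        # Elitism: the current best survives unchanged, so the best
        # fitness can never get worse from one generation to the next.
        children = [best]
        while len(children) < pop_size:
            parent_a = tournament(population, rng)
            parent_b = tournament(population, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    best, history = run_ga()
    print("best individual:", best)
    print("best fitness per generation:", history)
```

The three ideas borrowed from nature are all there: selection (the fittest individuals are more likely to become parents), crossover (children mix the genes of two parents) and mutation (small random changes keep diversity in the population). A real problem would replace the toy fitness function with a measure of how good a candidate solution is.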

The evolution of technology lets us connect all kinds of devices with sensors, and these sensors transmit all kinds of data over the internet so that we can act depending on the data processed. This is the Internet of Things, or IoT. In Europe there are some initiatives that promote the IoT world with lots of resources, guides, subsidies, … (Internet of the Things Europe Initiatives). These connections will produce huge quantities of data, so we will need petabytes of storage, the best computer performance and the most advanced applications to process the data. Here again we find the two constants: storage and data management techniques.

Thanks to this evolution we are changing the world of the Public Sector by applying this technology to Smart Cities, eHealth, Agriculture, etc. In the case of Smart Cities, we find Barcelona, which is ranked first in Spain and fourth in Europe, with projects like Intelligent Traffic Lights or Apps4bcn.

Thanks to my past as a statistician and the new era of Data Management, several months ago I started a new hobby: “Data Scientist”. My curiosity started on Coursera with this course about Machine Learning taught by the data scientist Andrew Ng, one of the co-founders and Chairman of Coursera. This course was a little bit intensive for me and I couldn’t dedicate the time needed to learn this fantastic material; I hope to come back to it in the medium term.

