Silly prototypes of the week
August 7th, 2008 under Devel, Algorithms, rengolin. [ Comments: none ]
This time I don’t have a project, and probably won’t have time to implement any of these (I’m on real holidays!), so I’ll just post the ideas and welcome any comments on their feasibility or usefulness. Feel free to implement them and let me know how it went.
Independent Bayesian Agents
An Artificial Neural Network is a collection of neurons connected in meaningful ways so that, when information coming from an input layer passes through them, a resulting signal is produced on the output layer. The learning is done via feedback (good or bad), by changing the internal function of each neuron to adjust it for the next “pulse”.
The problem with this is that, after the learning phase, the resulting network can only solve a narrow class of problems. A network created to recognise letters of the alphabet in an image has a different format and different components from one built to recognise other patterns. There are some implementations where the network itself can change its connections in order to learn something new, but that’s not commonly seen, not even in much of the scientific work on the subject.
You can have the same network structure using Bayesian nodes. The class of problems to be solved is a bit different but the overall logic is similar.
Bayesian Agents are a bit different. They are independent learning programs that can communicate with other agents when necessary and update their internal states from feedback and prior probabilities. They also have the advantage of returning not only true/false values but probabilities, or even probability distributions, to the user.
But this way, your agents will be much more complex than the neurons in an ANN, and that’s a path that is rarely worth taking if you want to build emergent behaviour, as it becomes more and more difficult to combine them to produce unexpected behaviours. So, the idea is to break all the agents into very small pieces that can interact with each other, with just a few rules.
This way, the agents themselves would be pretty much like objects in an OO design, following the implementation of a class (agent definition) and inheriting or using methods from other agents (interaction). You can, at the same time, build a network with any format you want (using graphs, or even network connections via a name service or similar, for the interactions) and keep it simple by following the agent definition and not putting any complex logic in each agent.
As in OO design, you don’t want to create one single class that does everything. What you build is a set of relationships between classes that expresses the problem you’re solving and, if it’s well designed, you can reuse some of the components to solve other problems as well. The power of OO design is in the relationships between the classes and not in what the classes are actually doing, right?
So, my hunch is that, if we build very simple agents and let the network itself be the relationship between them, we gain a new power of extensibility.
Also, by changing the connections, the learning happens at the network level instead of the agent level. Further on, by giving each agent an “explorer behaviour” (like extending a common class Explorer for all agents), they can search for new answers once in a while, in the background, to increase their levels of trust.
Along the same lines, by extending an agent Feedback you can make all agents able to process feedback from users, through derived classes such as BooleanFeedback (returning good or bad), MultipleChoiceFeedback, ContinuousFeedback and so on.
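To make the "very small agent" idea a bit more concrete, here is a minimal C++ sketch. Only the name BooleanFeedback comes from the text; the Agent class, its members and the way trust is estimated (Laplace's rule of succession) are assumptions made for illustration, not a worked-out design.

```cpp
#include <string>
#include <utility>

// Hypothetical feedback base class: counts good/bad answers from users.
class BooleanFeedback {
public:
    void feedback(bool good) {
        if (good) {
            ++good_;
        } else {
            ++bad_;
        }
    }

    // Estimated probability that the next answer is good.
    // Uses Laplace's rule of succession, which is an assumption here:
    // with no feedback at all the agent starts at 0.5.
    double trust() const {
        return (good_ + 1.0) / (good_ + bad_ + 2.0);
    }

private:
    unsigned good_ = 0;
    unsigned bad_ = 0;
};

// A deliberately tiny agent: a name and a level of trust, no other logic.
// Anything smarter is meant to come from how agents are connected.
class Agent : public BooleanFeedback {
public:
    explicit Agent(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }

private:
    std::string name_;
};
```

An agent that received two good answers and one bad one would report a trust of 3/5, a probability rather than a plain true/false.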
Genetic programming using template Policies
Learning by structure as given above (network structure) is similar to genetic algorithms, but at a different level. Genetic programming allows you to learn through the structure of the program itself. A standard way to do that is a rulebase approach: create a big set of simple rules and write programs that use them in a mixed fashion.
Given the problem you want to solve, run all programs against it and try to define the quality of each solution (if they’re at least able to get somewhere). Give the programs closer to the solution a higher chance of survival, but don’t kill the unfit, and that’s an important step.
If there are good, but dormant, rules (genes) in the unfit and you kill them, you’re removing from the genetic pool (rulebase) some solutions that could be the best when joined with others that you haven’t had the chance to join them with yet. Anyway, by crossing those programs (exchanging rules and creating new combinations) and running the next breed on the same problem you can, iteratively, lead the evolution towards the best solution by chance.
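One way to give fitter programs a higher chance without ever killing the unfit is to add a small constant to every fitness score before normalising. The sketch below assumes that scheme; the function name survivalChances and the minWeight parameter are made up for this example.

```cpp
#include <vector>

// Hypothetical helper: turn raw fitness scores into survival chances.
// minWeight (assumed > 0) is the weight every program keeps no matter how
// badly it scored, so dormant rules are never dropped from the pool.
std::vector<double> survivalChances(const std::vector<double>& fitness,
                                    double minWeight = 0.1) {
    std::vector<double> chances;
    chances.reserve(fitness.size());
    double total = 0.0;
    for (double f : fitness) {
        // Negative scores are treated as zero fitness.
        double weight = (f > 0.0 ? f : 0.0) + minWeight;
        chances.push_back(weight);
        total += weight;
    }
    // Normalise so the chances add up to 1.
    for (double& c : chances) {
        c /= total;
    }
    return chances;
}
```

The resulting values can be fed to any weighted random pick when choosing which programs to cross. A program with zero fitness still gets a small, non-zero chance.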
So, what about template policies?
C++ templates are very powerful. One of the great powers (which comes with great responsibility) is to be able to inherit functionality from the template argument.
template <class EatPolicy, class MovePolicy, class ReproducePolicy>
class LifeForm : public EatPolicy, public MovePolicy, public ReproducePolicy {
};
With this, you can create a life form following the policies provided. It will inherit the methods and properties of the policies as if it were normal OO inheritance. So, to create a bacterium, just implement some policies inheriting from EatPolicy, MovePolicy and ReproducePolicy and instantiate it like this:
LifeForm<Fermentation, Osmosis, Mitosis> bacteria;
You can also create several policies inheriting from Fermentation (from different materials, etc.) and create several types of bacteria, but the problem is how to cross them later. Because templates are evaluated at compile time (when the types are created), you can’t create new types at run time, so the crossing over has to be done off-line (followed by a recompile). A few pre-processor commands and a Perl script could do it, though, and isolate that nasty part from the rest of the code.
I’m not sure of the implementation details of the cross-over but the basic skeleton did work: Skeleton of genetic policies. I’ll give it a try later on.
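The policy idea can be sketched as a small self-contained example. The policy names Fermentation, Osmosis and Mitosis come from the text; Photosynthesis, the method names (eat, move, reproduce, describe) and their return strings are invented for illustration, and the policies here are plain structs rather than classes derived from common bases.

```cpp
#include <string>

// Hypothetical policies: each one contributes a single behaviour.
struct Fermentation {
    std::string eat() const { return "ferments sugar"; }
};
struct Photosynthesis {
    std::string eat() const { return "absorbs light"; }
};
struct Osmosis {
    std::string move() const { return "drifts by osmosis"; }
};
struct Mitosis {
    std::string reproduce() const { return "splits in two"; }
};

// The life form inherits its behaviour from its template arguments.
template <class EatPolicy, class MovePolicy, class ReproducePolicy>
class LifeForm : public EatPolicy, public MovePolicy, public ReproducePolicy {
public:
    std::string describe() const {
        return this->eat() + ", " + this->move() + ", " + this->reproduce();
    }
};

// Each combination is a distinct type, fixed at compile time, which is
// why new combinations cannot be created while the program is running.
using Bacterium = LifeForm<Fermentation, Osmosis, Mitosis>;
using Alga = LifeForm<Photosynthesis, Osmosis, Mitosis>;
```

Swapping one policy for another only changes a template argument, so a script that generates the `using` lines is one plausible way to do the off-line crossing.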
Convergence
August 7th, 2008 under Technology, rengolin, Hardware. [ Comments: none ]
I’m not into writing generic posts about buzzwords but I have to admit that I’m astonished. We got a Nokia N95 8GB last week and I’ve been playing with it ever since.
It’s amazing! I haven’t used my laptop since we got it… It has a decent web browser with Flash Lite support. All websites work perfectly: Gmail, Google, YouTube, BBC, The Register, Slashdot. Even Google Calendar works well. But the best thing is that it has 802.11g wireless support, so I don’t have to pay a penny to browse the internet! And all major websites have mobile versions as well, chosen automatically by user agent.
Games? It supports the N-Gage engine (as all new Nokias do) and has some decent games for its small screen. The best thing about the games is that they use the OpenGL ES 1.1 GPU on the thing. Skype, messenger, GPS, step counter etc. are also available. Can it get better?
Yet, the best thing is that you can develop in C++, Java and Python! Not on the move, though… I guess I’ll have to wait a few more iterations to get a terminal and a (free as in speech) compiler and a good processor to be able to compile code while I browse the web and listen to podcasts…
False security
August 5th, 2008 under InfoSec, Digital Rights, OSS, rengolin, Computers. [ Comments: none ]
False security is worse than no security. It’s that simple.
Bruce Schneier won’t stop saying that CCTV cameras are not only plainly ineffective, but also bring a false sense of security, even to police forces, which won’t patrol the streets as well as they would without cameras. People won’t worry as much as they would without cameras and become easy bait for common robbers.
The same applies to computer security, of course. Setting up a firewall on your computer and running an updated version of the latest anti-virus / anti-rootkit / anti-malware / anti-whatever won’t protect you from the simplest of attacks: social engineering. One email or phone call done right to the right person is enough to render the whole network inoperative for hours, or to pass sensitive information to black hats so they can do whatever they want or need in order to hack a system. Yours or any other.
As if that were not enough, as Bruce always points out, placing cameras will make robbers attack places without cameras. Along the same lines, installing personal firewalls will make viruses mutate and attack in more subtle ways. Placing proxies and snooping hardware on your network will only make the real offenders more careful when they’re accessing prohibited websites or protocols, for they will do it anyway.
The fact is simple: you can’t guarantee 100% security.
Money is hardly the issue here. Think of the amount of money the US spends on securing its own classified data. Probably more than what it spends on wars around the world. But it wasn’t enough: Gary McKinnon could get into all of that to search for UFO information (yes, I do believe him). Apple spends a whole bunch on securing its devices and Brazilian hackers unlocked the new iPhone 3G only 3 days after it was released.
DRM is the other myth. I can’t understand how people with a bit (not much) of clarity and intelligence can ever think it’s worth a shot. All major locks imposed on consumers were broken immediately after they were released. Hackers (good and bad ones) can easily break into any security scheme, but the general public will have to wear the digital handcuffs. It’s not only unfair, it’s utterly stupid and pointless.
There is no sensible choice other than to agree with Richard Stallman’s philosophy: ideas should be open and free. Competitive advantage must be in what you are doing rather than in what you’ve done. It’s impossible to secure the past, so let it go, walk forward, invent!
What’s the value (worth of stealing) of your previous achievements if your future ones are much better? What could a hacker possibly want with old things? If they’re hacking, it means you’re not fast enough! Keep up!!
When the hunter becomes the hunted
July 22nd, 2008 under InfoSec, Technology, rvincoletto, Articles, Sponsored. [ Comments: none ]
The fast evolution of computer networks has brought fantastic developments in communication and connection capacity.
We can easily see this evolution by observing the Internet, first a restricted network and now a complex, global one, where we can do anything from a simple mail exchange to complex and elaborate financial transactions.
But we also have the dark side of this fantastic environment: threats like viruses, worms and Trojan horses, scanning, spoofing, sniffing or snooping, and so many others have become the nightmare of all organizations.
Indeed, technology can play for and against us.
A good way to make technology work for us is to use Packet Inspection. This is a tool frequently used to sniff networks, looking for passwords and breaches, but information security professionals can use it to do exactly the opposite: protect the network.

With a good Packet Analyzer you can generate information about your integrated information systems, helping the system administrator to find and solve problems quickly and efficiently. It’s possible to identify attacks, unauthorized access to systems and malicious behaviors. In other words, with a good inspection solution your organization will be able to see and analyze everything that hits your network.
You can prevent problems and also reconstruct network sessions, providing the information needed for Network Forensics. This is when the hunter becomes the hunted: to defend your organization, you will be using the same methods that malicious threats use to put your business at risk.
Do you want to know what Packet Inspection is? Watch this video for more information: Deep Packet Inspection explained or read here at Wikipedia.
Silly project of the week: molecule dynamics
July 9th, 2008 under Devel, Algorithms, rengolin, Physics. [ Comments: 1 ]
This week’s project is a molecular dynamics simulation. Don’t get too excited, it’s not using any of the state-of-the-art algorithms, nor is it assembling 3-dimensional structures of complex proteins. I began with a simple carbon chain using only Coulomb’s law in a spring-mass system.
The molecule I’m using is this:

The drawing program is quite simple and won’t work for most molecules, but for simple 2-dimensional molecules (max. of 3 connections per atom) it kinda works.
Later on, with the program running, each atom “pushes” all the others electrically and the springs “pull” them back. A good way to solve that is to say that q1 . q2 / x² - k . x = m . d²x/dt² (where x is a vector) and integrate numerically using Runge-Kutta.
But this is my first OpenGL program, so I decided to go easy on the model and actually see it pseudo-working with an iteration-based simulation following the same equations above. This picture is a frame after a few iterations.
Quoting its page: “As this simulation is not using any differential solution, the forces grow and grow until the atom becomes unstable and break apart. Some Runge-Kutta is required to push the realism further.”
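For reference, this is roughly what a Runge-Kutta step could look like for the simplest possible case. It is only a sketch under strong assumptions: one dimension, two atoms, x is the distance between them, the Coulomb constant is folded into the charge product, and the names Bond, State, acceleration, energy and rk4Step are invented here, not taken from the prototype.

```cpp
// Reduced 1-D model (an assumption): x is the distance between two atoms.
struct Bond {
    double qq;    // product of the two charges, Coulomb constant folded in
    double k;     // spring constant
    double mass;  // effective mass
};

struct State {
    double x;  // separation between the atoms
    double v;  // rate of change of the separation
};

// Acceleration from Coulomb repulsion (qq / x^2) and spring pull (-k x).
double acceleration(const Bond& b, double x) {
    return (b.qq / (x * x) - b.k * x) / b.mass;
}

// Total energy (kinetic + Coulomb + spring); a well-behaved integrator
// should keep this nearly constant.
double energy(const Bond& b, const State& s) {
    return 0.5 * b.mass * s.v * s.v + b.qq / s.x + 0.5 * b.k * s.x * s.x;
}

// One classical fourth-order Runge-Kutta step of size dt.
State rk4Step(const Bond& b, const State& s, double dt) {
    double k1x = s.v;
    double k1v = acceleration(b, s.x);
    double k2x = s.v + 0.5 * dt * k1v;
    double k2v = acceleration(b, s.x + 0.5 * dt * k1x);
    double k3x = s.v + 0.5 * dt * k2v;
    double k3v = acceleration(b, s.x + 0.5 * dt * k2x);
    double k4x = s.v + dt * k3v;
    double k4v = acceleration(b, s.x + dt * k3x);
    return State{
        s.x + dt / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x),
        s.v + dt / 6.0 * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)};
}
```

Calling rk4Step in a loop and watching energy() is a cheap sanity check: if the value drifts steadily upwards, the forces are "growing and growing" in the way the quote describes.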
UPDATE:
The webpage of the fully-functional prototype is HERE.
Proprietary Software
July 3rd, 2008 under Digital Rights, OSS, rengolin. [ Comments: 7 ]
I’m a big advocate of free software, highly active in the Anti-DRM campaign and a big fan of Richard Stallman (as you can see by reading back through lots of posts on this very blog). In his latest text to the media, about Bill Gates’ retirement, he makes (as usual) some very strong arguments about fair societies, freedom to use and copy, etc. We all know that, right?
Well, there is one thing I don’t particularly agree with: proprietary software.
In a recent talk, he said there was a fair reason why copyright exists: investment in technology. In the old days, it was the press. Today, we have software companies.
The beauty and the beast
Microsoft, as he said (and I reiterate), has only abused development done by other companies since its first product. Worse, since then, it’s been buying one company after another and scrapping each one of them (pretty much like Yahoo! has been doing recently, hence the interest). But there are lots of others that are doing fine, and it’s not fair to put them all in the same box.
Adobe Photoshop is a great example. Gimp is fantastic, of course, but the investment in Photoshop is huge and there is a clear difference. The cost is high, but the quality is also high. Like Photoshop, many other pieces of specialist software in music, video, animation, science, electronics, games and so on have a specific market, to which they belong, and are doing pretty well. I’m not saying Adobe (or any other specialist company) is fair, just that some are investing seriously in development, not only sucking away their users’ money and freedom.
Windows is unfair, it locks the user in, it treats users as liars and cheaters, yes. Worse still, you can’t use it with anything else because it’s forcefully incompatible with the rest of the world, yes! They’re cheating by making you buy their license even if you’re not using it, forcing you to update Internet Explorer even if you use Firefox, reporting all your actions to Microsoft and god knows what else. YES!!
Apple with its horrible DRM locks, pushing iPhone updates and all the rest we already know they do, Warner, Sony and the like. Yes! They are mean! But that doesn’t mean all companies are.
Research and Development
If you have free (open source) software that is enough for your uses, or you can hire someone to extend or adapt it to your needs, fine! If you can write software for your needs and redistribute it to the rest of the world, perfect! But why deny the existence of fair research and development, I don’t know.
I’ve been on the academic side of development long enough to know very well what happens here: some PhD writes a piece of software, without any care for quality or extensibility. Later on, someone (or the author) makes it open source and people start using it and extending it. But most of the time it’s not possible to carry on extending it; it needs a rewrite. And people rewrite software fortnightly in academia.
The investment is in giving PhDs a good time, not in producing good software. Free software is good not because of that investment, but because the people who need it write it. It’d be fantastic if academia could teach them about software quality, and if there were real control over what they produce (like acceptance by the open source community) as part of their grades.
Now, private companies (like many around Cambridge) invest a good bunch of money in research and development, hiring those same guys, giving them proper training in software engineering and getting things done, very well indeed. That costs money, and I can’t see how they could open the source, at least not in the first years of sale.
Extensibility
Some companies give it away for free (as in beer) to academic institutes. But the most important thing (IMHO) is to be extensible and to have a clear interface. Good software, even if closed source, has a clear and easy-to-use interface. With that, you can extend it to suit your needs. It’s not as good as having the source, but it’s a start.
Enforcing DRM locks, spying on users, making it impossible to connect to other software, being nasty: that is the problem, not being proprietary.
What is bioinformatics anyway?
June 24th, 2008 under Devel, rengolin, Bioinformatics. [ Comments: 3 ]
After almost three years in the field I’m pretty sure I have no idea. A few months ago I thought I knew and wrote an essay about software quality in bioinformatics, but I have now figured out that, even though those things might make sense to the rest of us, for bioinformatics they don’t.
Wikipedia (which has a much higher quality than many papers I’ve read) defines software as: “a general term used to describe a collection of computer programs, procedures and documentation that perform some tasks on a computer system”. It also defines programming as: “the process of writing, testing, debugging/troubleshooting, and maintaining the source code of computer programs”.
So, everyone who writes programs (let’s forget about documentation, tests, maintenance etc. for now) is a programmer. But a computer programmer IS NOT a software engineer. Programmers can write as much code as they want but without formal definitions, metrics, good design decisions and practices, tests, documentation and so on, they are as useless as ants without pheromones.
Quick tip: Whenever you see a job for a software engineer in a bioinformatics institute, beware: It generally means a developer to maintain random code and make random changes in random environments.
So what?
I might not have a clue about what bioinformatics is, but now I’m pretty sure what it ISN’T: Software Engineering. You will find a huge amount of code, scripts, programs and databases but rarely a fair piece of software. Therefore, my previous ideas could be valid for software quality, but not at all for bioinformatics.
Don’t get me wrong, I know some bioinformaticians (and programmers around) that understand the basic ideas about software and quality and why we should have them, but the whole structure, the scientific community, the people that give them money, have no idea whatsoever of what software really is or where it fits in the loop.
Still, bioinformaticians are getting half-programming and half-biology degrees, in two fields that each have more to know than the whole of humanity can hold in their brains added up. How is it possible (and fair) to put those poor guys to work in such sub-human conditions, without any guidance or quality control, without any clue, in fact, about what they should really be doing in the first place?
Some of them come out pretty well, so well that they abandon the field and go to work at better companies, with much better software strategies, proper engineering, scientific development in the right place (sandboxes) and production code written by real engineers with solid experience in mission-critical environments.
In the end, it leaves bioinformatics (to be fair, the informatics part only) in the hands of people who are inexperienced in all sorts of fields and at all levels: students writing production software, and people who never saw a mission-critical environment coordinating databases, filesystems and development, with one bad decision after another.
Is it just a rant, then?
No, not really. It’s a liberation. For a while I struggled to understand the motives behind those weird decisions. I knew that every industry has a whole set of values, and people can sometimes take completely awkward decisions which turn out to be the right ones. I’ve seen it happen when moving between jobs, especially when I worked at Yahoo! (big company, big culture). But with time, the awkward decisions still sounded awkward, even after considering all the new information I had.
Other people got fed up with all this and left, one after the other. I talked to them, and the answer was always the same: random (generally bad) decisions, ego in astronomic proportions and zero technical knowledge on all sides. Now I’m leaving for good and you won’t need to ask me why, will you?
I generally need a very good reason to leave a workplace. I was feeling out of place but couldn’t leave without a very good reason, and now I’ve got a good bunch of them…
A liberation indeed!
Is there a way out?
Seriously, no. In 10 years definitely no. In 15 quite likely no. In 20, maybe… but things must start changing now!
Being optimistic, assuming they stop running around like headless chickens, they would still need strong guidance, which is virtually impossible because of the strong egos of scientists in general. Bioinformatics has existed for decades already; who is the software engineer that will tell them they’re doing it all wrong?
Besides, the people that grant them money (governments) have no clue about software engineering (nor should they) and they will keep sending money every year as long as, in the reports, the groups pretend to be doing great things. In fact, most of it could’ve been done in a few weeks by two or three people prepared to compromise.
Who doesn’t want a job where they can do almost nothing at all, get paid every month without even the remote fear of losing their job and still pretend they’re doing great things? Whoever says no to this and starts working for real gets a really bad reputation… While this win-win situation keeps going, there is little or zero chance of doing real stuff in the field, and bioinformatics is doomed to constant failure and ineffectiveness.
Finally, it’s not a specific problem, where you can just change a couple of people and everything will be all right, as many believe. This is nobody’s fault, it’s just the way the two fields, biology and informatics, were joined together some decades ago and never straightened out. If there is a way out, I’d be very glad to see it and will congratulate those who manage it, but this is much more politics than software development and I am, very luckily, just a programmer…
Music Evening 2008
June 24th, 2008 under rengolin, Music. [ Comments: 1 ]
Last week was our annual Music Evening and our jazz band didn’t play any jazz, unfortunately… But we did play some songs to match our new singer:
And now for the new stars:
Don’t expect too much, we’re not that good, we didn’t have time to rehearse, the equipment was weak (voice and piano too low) and the camera was not even a video camera… but, have fun!
Long live open source
June 20th, 2008 under OSS, rengolin, Software. [ Comments: none ]
Another fine example of how impressive the open source community can be, even when compared with the biggest software companies.
Yesterday we had a gig at our annual Music Evening and I needed to edit the videos to upload them to my wife’s website. I went to the download page on cinelerra’s website, got the Ubuntu 8.04 repository, updated the package listing and tried to install cinelerra:
sudo apt-get install cinelerra
It should be that easy but unfortunately the repository had an error:
Err http://repository.akirad.net akirad-hardy/main akiradnews 20080417
500 Internal Server Error
Err http://repository.akirad.net akirad-hardy/main libguicast
1:2.1.0-1svn20080530akirad1
500 Internal Server Error
Err http://repository.akirad.net akirad-hardy/main libmpeg3hv
1:2.1.0-1svn20080530akirad1
500 Internal Server Error
Err http://repository.akirad.net akirad-hardy/main libquicktimehv
1:2.1.0-1svn20080530akirad1
500 Internal Server Error
Well, with nothing else to do about it, I followed the instructions on the website, which said to email the guy who put the packages in place. Seriously, I thought it would take a while (days?) until the guy had time to go home, do whatever he wanted to do at home, check his emails, talk to the ISP, bla bla bla.
To my surprise, after exactly 1 hour and 20 minutes he replied (in English and Italian) that the packages had been reloaded and I should be able to get them, which I did, and indeed, it all worked absolutely fine. I now have my videos edited.
The “guy” was actually Paolo Rampino, whom I thank very much, but I’d also like to acknowledge once more the power of the open source community. I wonder, if I had a much more serious problem (security) with a copy of Windows or Office, whether Microsoft would take 1 hour and 20 minutes not only to answer, but to fix it!
Thanks again Paolo, you made another user very happy indeed.
Markov chain available for NumCalc
June 13th, 2008 under Devel, Algorithms, rengolin. [ Comments: none ]
NumCalc is my personal numerical methods program where I’ve implemented some nice algorithms for numerical computation. The newest on the list is the Markov chain.
The Wikipedia article (link above) is far too complex… I’ll try to give a simplified explanation:
A travelling salesman goes back and forth between a set of cities and, given the city he is currently in, you want to know which city he’ll travel to next. Of course, he won’t show you his travel itinerary.
The simplest way of doing it is to record all the trips he makes over time. For each city, you keep a counter of how many times he went from that city to each of the others. If you take these numbers as a fraction of all the trips from each city, you have the probability of going to any other city on the list.
Example: When he was in Paris, he went 3 times to London, 2 times to Amsterdam and only 1 time to Milan. It means that 3 out of 6 times (50%) he went to London, so the probability of going there again is 50%.
For such small quantities it’s weird to assume that the behaviour will always be the same (he can go to new cities as well) but when the amount of statistics you have is big, the behaviour becomes very repetitive and thus predictable.
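The counting scheme described above fits in a few lines of C++. This is a sketch for illustration only, not NumCalc’s code: the class name TravelLog and its methods record and probability are assumptions.

```cpp
#include <map>
#include <string>

// Hypothetical first-order Markov chain built from observed trips.
class TravelLog {
public:
    // Record one trip from one city to another.
    void record(const std::string& from, const std::string& to) {
        ++counts_[from][to];
        ++totals_[from];
    }

    // Fraction of all trips leaving `from` that went to `to`.
    // Returns 0 for cities or routes that were never observed.
    double probability(const std::string& from, const std::string& to) const {
        auto total = totals_.find(from);
        if (total == totals_.end()) {
            return 0.0;
        }
        // Every city in totals_ also has a row in counts_ (see record()).
        auto row = counts_.find(from);
        auto cell = row->second.find(to);
        if (cell == row->second.end()) {
            return 0.0;
        }
        return static_cast<double>(cell->second) / total->second;
    }

private:
    std::map<std::string, std::map<std::string, unsigned>> counts_;
    std::map<std::string, unsigned> totals_;
};
```

Recording the six trips from the Paris example (3 to London, 2 to Amsterdam, 1 to Milan) and asking for probability("Paris", "London") gives 0.5, matching the 50% worked out by hand.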
Real Cases:
- MegaHAL uses an advanced Markov model to create chat bots that reply using what people said before, based primarily on the probability of one word coming after another.
- HMMER uses hidden Markov models (a Markov model to predict another Markov model to predict something else) that can do powerful searches within long and scrambled sequences of proteins and genes. The InterPro group uses it to find their protein matches against UniProt.
Of course my super-simplified model is far from being that efficient and useful, but it’s a good start for understanding how simple and how powerful they are. You can download it from its webpage.