False security
August 5th, 2008 under InfoSec, Digital Rights, OSS, rengolin, Computers. [ Comments: none ]
False security is worse than no security. It’s that simple.
Bruce Schneier keeps pointing out that CCTV cameras are not only plainly ineffective, they also give a false sense of security, even to police forces, which won’t patrol the streets as well as they would without cameras. People won’t worry as much as they would without cameras either, and become easy prey for common robbers.
The same applies to computer security, of course. Setting up a firewall on your computer and running an updated version of the latest anti-virus / anti-rootkit / anti-malware / anti-whatever won’t protect you from the simplest of attacks: social engineering. One email or phone call done right to the right person is enough to render the whole network inoperative for hours, or to pass sensitive information to black hats, who can then do whatever they want or need in order to hack a system. Yours or any other.
As if that was not enough, as Bruce always points out, placing cameras will make robbers attack places without cameras. Along the same lines, placing personal firewalls will make viruses mutate and attack in more subtle ways. Placing proxies and snooping hardware on your network will only make the real offenders more careful when they’re accessing prohibited websites or protocols, for they will do it anyway.
The fact is simple: you can’t guarantee 100% security.
Money is hardly the issue here. Think of the amount of money the US spends on securing its own classified data. Probably more than what it spends on wars around the world. But it wasn’t enough: Gary McKinnon could get into all of that to search for UFO information (yes, I do believe him). Apple spends a whole bunch on securing its devices, and Brazilian hackers unlocked the new iPhone 3G only 3 days after it was released.
DRM is the other myth: I can’t understand how people with a bit (not much) of clarity and intelligence can ever think it’s worth a shot. All major locks imposed on consumers were broken immediately after they were released. Hackers (good and bad ones) can easily break any security scheme, but the normal public will have to wear the digital handcuffs. It’s not only unfair, it’s utterly stupid and pointless.
There is no sensible choice other than to agree with Richard Stallman’s philosophy: ideas should be open and free. Competitive advantage must be in what you are doing rather than in what you’ve done. It’s impossible to secure the past, so let it go, walk forward, invent!
What’s the value (worth of stealing) of your previous achievements if your future ones are much better? What could a hacker possibly want with old things? If they’re hacking, it means you’re not fast enough! Keep up!!

Why not the primary key?
April 23rd, 2008 under Devel, rengolin, Computers. [ Comments: 3 ]
Recently I ran into an amusing situation with Oracle (again) where the primary key was not used even when explicitly requested…
The query was:
select name from table where table_id = 1;
Of course, table_id was the primary key. Astonishingly Oracle performed a FULL_TABLE_SCAN.
WTF?
When things like that happen you think there’s something quite wrong with the database, so I decided to ask the DBAs what was going on. The answer was something like:
“It may happen if Oracle decides to. Even though you have created an index, there is no guarantee it will be used for all queries; the optimizer will decide which path is best”.
Seriously, if Oracle decides NOT to use the primary key for that query, there is something really wrong with the whole thing. I couldn’t think of a situation where that might be even close to valid! A friend who knows Oracle much better than I do described two extreme cases where it could happen:
- There are just very few records in that table -> table data = 1 data block. Reading the index root block (1 data block) and then accessing the one table data block is certainly more expensive than just reading that one table data block.
- The index is in a Tablespace with different block size which resides on very slow disks. The buffer cache for the non-default block size is hugely under-sized. So the cost to read the index and the table data might be higher than just reading the table data. It’s a bit unrealistic, but I’ve witnessed stupid things like this.
Let’s face it, the first scenario is just too extreme to be true. If you have only one data block in your table you’d better use email rather than a database. As for the second scenario, why would anyone put indexes on a much slower disk? Also, if the index is too big the data will be proportionally big too, so there is no gain in doing a full table scan anyway.
Later on he found out what the problem was by hacking into the configuration parameters:
- The production database (working fine) had:
optimizer_index_caching = 95
optimizer_index_cost_adj = 10
- The development database (Oracle default values) had:
optimizer_index_caching = 0
optimizer_index_cost_adj = 100
I don’t quite understand what they really mean to Oracle, but index_caching = 0 seems a bit too radical to me to be the default.
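As far as Oracle's documentation goes, optimizer_index_cost_adj is a percentage applied to the estimated cost of index access paths: 100 leaves the cost as it is, 10 makes an index look ten times cheaper. The sketch below is only a toy model of that scaling, to show how the two settings above can flip the decision. It is not Oracle's real cost formula, and the function name and block counts are made up for illustration.

```python
def choose_path(table_blocks, index_blocks, cost_adj=100):
    """Toy model: pick the cheaper access path for a single-row lookup.

    Assumption: cost is counted in blocks read, and cost_adj scales the
    index path as a percentage. This is NOT Oracle's actual cost formula.
    """
    full_scan_cost = table_blocks
    # Index blocks read, plus one table block to fetch the row itself
    index_cost = (index_blocks + 1) * cost_adj / 100
    return "index" if index_cost < full_scan_cost else "full scan"


# The friend's first scenario: a one-block table with a one-block index
print(choose_path(1, 1, cost_adj=100))   # Oracle default on development
print(choose_path(1, 1, cost_adj=10))    # production setting
# A bigger table, still with the default setting
print(choose_path(1000, 3, cost_adj=100))
```

In this toy model the tiny table gets a full scan with the default of 100 and an index lookup with 10, while a large table picks the index either way.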
In the end (and after more iterations than I’d like) it was fixed, but what really pissed me off was getting the pre-formatted answer that “Oracle knows better which path to take” without anyone even looking at how stupid that decision was in the first place. This extreme confidence in Oracle drives me nuts…

gzip madness
April 9th, 2008 under Devel, Unix/Linux, rengolin, Computers. [ Comments: 3 ]
Another normal day here at EBI: I changed a variable called GZIP from local to global (via export in Bash) and got a very nice surprise: all my gzipped files had gzip itself as a header!!!
Let me explain… I have a makefile that, among other things, gzips some files. So, I’ve created a variable called GZIP that is the same as "gzip --best --stdout" and in my rules I do:
%.foo : %.bar
	$(GZIP) < $< > $@
So far so good, it always worked. But I had a few makefiles redefining the same command, so I thought: why not make an external include file with all the shared variables? I could use include for the makefiles, but I also needed some of those variables in shell scripts, so I decided to use “export VARIABLE” for all make variables (otherwise they aren’t picked up) and called it a day. That’s when everything started failing…
gzip environment
After a while digging into the problem (I was blaming poor LSF for it) I found that when I didn’t have the GZIP variable defined all went well, but the moment I defined GZIP="/bin/gzip --best --stdout" even a plain call to gzip was corrupted (i.e. had the gzip binary as a header).
A quick look at gzip’s manual gave me the answer… GZIP is the environment variable where gzip looks for its default options. So, if you say that GZIP="--best --stdout", every time you call gzip it’ll use those parameters by default.
So, by putting “gzip” on the parameter list, I was always running the following command:
$ /bin/gzip /bin/gzip --best --stdout < a.foo > a.bar
and putting a compressed copy of the gzip binary together with a.foo into a.bar.
What a mess a simple environment variable can make…
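To make the mechanism a bit more concrete, here is a rough model of what happens to the argument list. It assumes, as the manual describes, that the words in GZIP are interpreted before whatever was typed on the command line. The helper names are hypothetical, and this only mimics the argument handling; it never calls gzip.

```python
import shlex


def effective_argv(gzip_env, argv):
    """Model of how gzip treats $GZIP: its words are read as if they had
    been typed right after the program name, before the real arguments."""
    return [argv[0]] + shlex.split(gzip_env) + argv[1:]


def file_operands(args):
    """Everything after the program name that is not an option is a file."""
    return [a for a in args[1:] if not a.startswith("-")]


good = effective_argv("--best --stdout", ["gzip"])
bad = effective_argv("/bin/gzip --best --stdout", ["gzip"])

print(good, file_operands(good))   # options only, no files
print(bad, file_operands(bad))     # /bin/gzip is now a file to compress
```

With the command name left inside the variable, the binary's own path ends up as a file operand, which is exactly the command shown above.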

Serial thinking
March 11th, 2008 under Fun, Devel, Algorithms, rengolin, Computers, Physics. [ Comments: 2 ]
I wonder why the human race is so tied up with serial thinking… We are so limited that even when we think in parallel, each parallel line is serial!
What?
Take the universe. Every single particle in the universe knows all the rules (not many) that it needs to follow. By themselves, the rules are dumb: you have weight, charge and can move freely around empty space. But join several particles together and they form a complex atom with many more rules (combined from the first ones) that, if combined again, form molecules that form macro-molecules that form cells that form organs that form organisms that form societies etc. Each level makes an exponential leap in the number of rules from the previous one.
Then the stupid humanoid looks at reality and says: “That’s too complex, I’ll do one thing at a time”. That’s complete rubbish! His zillions of cells are doing zillions of different things each, his brain is interconnecting everything at the same time and that’s the only reason he can breathe, wee and whistle at the same time.
Now take machines. The industrialization revolutionized the world by putting one thing after the other, Alan Turing revolutionized the world again by putting one cell after the other in the Turing tape. Today’s processors can only think of one thing after the other because of that.
Today you have multi-core processors doing different things, but each core is still doing things in serial (Intel’s HyperThreading is inefficiently working in serial). Vector processors like graphics cards and big machines like the old Crays do exactly the same thing over a list of different values, and quantum computers will do the same operation over an entangled bunch of qubits (which is quite impressive) but still, all of it is serial thinking!
Optimization of code is to reduce the number of serial steps, parallelization of code is to put smaller sets of serial instructions to work at the same time, even message passing is serial on each node, the same with functional programming, asynchronous communications, everything is serial at some point.
Trying to map today’s programming languages or machines to work at the holographic level (such as the universe) is not only difficult, it’s impossible. The Turing machine is serial by concept, so everything built on top of it will be serial at one point. There must be a new concept of holographic (or fractal) machine, where each part knows all rules but only with volume you can create meaningful results, where code is not done by organizing the high-level rules but by creating a dynamic for the simple rules that will lead to the expected result.
How then?
Such a holographic machine would have a few very simple “machine instructions” like “weight of photon is 0x000” or “charge of electron is 1.60217646 × 10^-19” and time would define the dynamics. Functions would be pre-defined arrangements of basic rules that must be stable, otherwise they’d blow up (like too many protons in the nucleus). But it wouldn’t blow up the universe (as in throwing exceptions), it would blow up the group itself, which would become lots of smaller groups, down to the indivisible particle.
The operating system of such a machine should take care of the smaller groups and try to keep the groups as big as possible by rearranging them in a stable manner, pretty much as a God would do to its universe when it goes crazy. Programs running on this operating system would be able to use God’s power (GodOS libraries) to manipulate the groups at their own discretion, creating higher beings, able to interact, think and create new things… maybe another machine… maybe another machine able to answer the ultimate question of Life, the Universe and Everything.
I know letting the machine live would be the proper way of doing it, but that could take a few billion years, and I’ll be quite tired of engineering the machine and its OS, so I’ll just want to get the job done quickly after that…
Why?
There is a big fuss about NP-complete problems (NP stands for nondeterministic polynomial time), those for which nobody knows a solution in reasonable (polynomial) time. The classic example is the travelling salesman problem, where a salesman has to go to each one of a number of cities. Which is the best path to follow to visit all of them in the smallest distance possible? With 3 or 4 it’s quite simple, but when you have lots, like 300, it becomes impossible for normal (serial) computers to solve by checking every possible route.
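To get a feeling for how fast brute force gets out of hand, here is a small sketch that just counts the candidate routes. It assumes the round-trip version of the problem with the same distance in both directions, so with the starting city fixed there are (n - 1)! / 2 distinct routes. The function name is mine, for illustration only.

```python
import math


def route_count(cities):
    """Distinct round trips through all cities, with the start fixed and
    direction ignored: (n - 1)! / 2. Assumes symmetric distances."""
    if cities < 3:
        return 1
    return math.factorial(cities - 1) // 2


for n in (4, 10, 20, 300):
    digits = len(str(route_count(n)))
    print(f"{n} cities: the number of routes has {digits} digits")
```

Four cities give three routes to compare, which is why the small cases feel trivial. The count grows factorially from there, and that is the wall a serial machine checking routes one by one runs into.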
Another quite fancy problem is the Steiner tree problem, where you have some points and you want to connect them using the least amount of string. This is as complex as the problem above and can take forever (longer than the age of the universe) for relatively small sets of points, but if you use water and soap the problem is solved almost instantly.
Of course, soap films cannot calculate the last digit of pi, but because every part of the film knows a small list of basic rules (surface tension, lowered by the soap molecules and derived from opposite charges between atoms) every particle of the machine works together at the same time, and the result is only achieved because the dynamics of the system has its least energy (least amount of string) in that state.
It’s true that today’s computers are very efficient at working on a wide range of problems (thanks to Turing proving the classes of problems his tape could solve) but there are some that they can’t solve, given that we only have a few billion years of universe left to spare. Such problems could be solved if there was a holographic machine.
UPDATE:
More or less what I said was practically applied here. Thanks André for the link, this video is great!

Bioinformatics and its problems
February 21st, 2008 under Devel, rengolin, Computers, Biology, Bioinformatics. [ Comments: none ]
For the last two months I’ve been writing a text about software quality in bioinformatics and the first part is done: I finally finished the basic concepts and tasks on why and how to perform software quality assurance in bioinformatics.
The big reasons why I focused on bioinformatics are:
- I’m working in a bioinformatics institute
- Bioinformatics has LOTS of problems
If you liked the first part (link just above) or would like to know more about my solutions and ideas keep reading. Use the root link as your entry point and go reading by chapter.
I couldn’t do the next/previous links as the wiki software doesn’t have this automatically and I didn’t want to hard-code it in the text (it’s a software quality text, isn’t it?).
Disclaimer:
- The text was written very fast, you’ll probably find lots of incoherent phrases and grammar errors, ignore them for now as I’m re-reading and re-writing everything.
- I’ve put more than I think I should and am now filtering what’s worth staying. I might also add a few more new things.
- Most code samples won’t work, they’re a simplified language for clarification only.
Do let me know if you think you could add something I forgot, or if you disagree with any concept; the text is still in a very immature state.

RDBMS, to rewrite or not to rewrite… I got confused…
February 19th, 2008 under Devel, Algorithms, Distributed, rengolin, Computers, Software. [ Comments: none ]
Mike Stonebraker (Ingres/Postgres) seems to be confused as well…
First he said Google’s Map/Reduce was “Missing most of the features that are routinely included in current DBMS”, but earlier he said to ditch RDBMS anyway because “modern use of computers renders many features of mainstream DBMS obsolete”.
So, what’s the catch? Should we still use RDBMS or not? Or should we keep developing technologies based on relational databases while Mike himself develops the technology of the new era? Maybe that was the message anyway…
My opinion:
MapReduce is not a step backwards; there are times when indexing is actually slower than brute force. And I’m not talking about insert time, when the indexes have to be updated and so on, I’m talking about the actual search for information: if the index is too complex (or too big) it might take more time to search through the index, compute the location of the data (which might be anywhere in a range of thousands of machines), retrieve the data and later on sort or search on the remaining fields.
MapReduce can effectively do everything in one step, while still on the machine, and return fewer values per search (as opposed to doing primary key searches first), and therefore less data will be sent over the network and less time will be taken.
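A minimal single-process sketch of the idea, just to show the shape of it: the map step scans every record and filters on the spot, so only the small per-key results ever need to travel. This is not Google's implementation; the function names and the sample data are invented for illustration.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Brute-force scan: run mapper over every record, group the emitted
    (key, value) pairs by key, then reduce each group to one result."""
    groups = defaultdict(list)
    for record in records:                 # no index, every record is read
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical web log, standing in for data spread over many machines
hits = [
    {"url": "/a", "status": 200},
    {"url": "/b", "status": 404},
    {"url": "/a", "status": 200},
    {"url": "/a", "status": 500},
]


def ok_hits(record):
    """Filter while scanning: only successful hits are emitted at all."""
    if record["status"] == 200:
        yield record["url"], 1


def total(key, values):
    return sum(values)


result = map_reduce(hits, ok_hits, total)
print(result)
```

On a real cluster each machine would run the map step over its own slice of the data, and only the emitted pairs would cross the network, which is the saving described above.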
Of course, MapReduce (like any other brute-force method) is hungry for resources. You need a very large cluster to make it really effective (1800 machines is enough :)) but that’s a step towards something different from RDBMS. In the distributed world, RDBMS won’t work at all, something has to be done and Google just took the first step.
Did we wait for warp-speed to land on the moon?! No, we got a flying foil crap and landed on it anyway.
Next steps? Many… we can continue with brute-force and do a MapReduce on the index and use the index to retrieve in an even larger cluster, or use automata to iteratively search and store smaller “views” somewhere else, or do statistical indexes (quantum indexes) and get the best result we can get instead of all results… The possibilities are endless…
Let’s wait and see how it goes, but yelling DO IT and then later DON’T is just useless…
UPDATE:
This is not a rant against Stonebraker, I share his ideas about the relational model being far too outdated and the need for something new. What I don’t agree with, though, is that MapReduce is a step backwards. Maybe it’s not even a step forward, probably sideways.
The whole point is that the relational model is the thesis and there are lots of antitheses, we just haven’t come up with the synthesis yet.

Help us, Obi-Wan Kenobi; you’re our only hope…
February 18th, 2008 under Web, OSS, rengolin, Computers, Articles. [ Comments: none ]
After Yahoo! rejected the MS offer and all the fuss about the Yahoo! takeover, now Yahoo! itself is breaking apart…
No wonder the shareholders are mad. Yahoo! has been falling to pieces since Google came onto the scene, and now, with the $31 / share offer when it was barely holding itself above $20, the shareholders saw all the return on their investment happening in a very short time, in what might be the last chance they have to see any money back at all.
So here’s a bit of futurology:
David Filo moves to Hawaii, shareholders sue Jerry Yang and he’ll end up very poor on his own Caribbean island, Yahoo! is bought by Microsoft for half the price (after the lawsuits there will be little left) and the shareholders will be very happy to, at least, get some money back.
All FreeBSD / Apache / PHP will be converted to Windows 2003 Server / .NET / C# and Yahoo! services will be even worse than they used to be, Microsoft will take the users and force them to start using Google services (no one likes to eat crap anyway) and Google will be the last hope of the Internet.
Fortunately Google is by far more efficient than Microsoft and Yahoo! together (it’s not that hard anyway) and it’ll be a piece of cake to take them both down while still holding their hats with the other hand. I just hope Google doesn’t try to dominate the world as Microsoft has been attempting for decades; they probably know by now that it’s like reaching the speed of light: the bigger you are, the more energy you need to increase speed.
Microsoft and Yahoo! will still exist for a loooong time and Google will have a bit of competition for a while, at least until the “next-Google(tm)” shows up and puts all three in the sack “with a wave of her hand(tm)” and the cycle starts all over again.
Let’s hope for the best, whatever that is…

Who’s the amateur now?
January 15th, 2008 under Unix/Linux, rengolin, Computers, Software. [ Comments: 3 ]
A long time ago, when I started using Linux, lots of people laughed at me: “How absurd! You have to compile your own kernel, what do they want with that? They’ll get nowhere!”. Well, things have changed a bit in the last decade and Linux grew up into a very mature, modern and user-friendly operating system, as we (not them) all expected.
OS companies didn’t believe it at the start, but with time Linux became a nuisance, then a problem, and now it’s real competition. Not only Linux (or rather GNU/Linux) but all free software and all the free licenses like GPL, BSD, CC, etc. Linux is real business: it’s more stable, faster, better designed and changes so much faster than any other OS in existence, both for security patches and new features. Lots of companies today contribute to free software without charge or restrictions, just because we gave them so much without charge or restrictions (and it turns out as profit too!).
But last year something I wasn’t expecting happened… The biggest OS company of the last 15 years made a move so stupid that I couldn’t believe it. Windows Vista was not an operating system, it was a joke, a *very bad joke* indeed. It reminded me of the first upgrades of the first Linux distros back in 94; it was a nightmare.
Well, it seems the free software community has learnt a lot about deployment, user interfaces, quality assurance and software development strategies. On the other hand, Microsoft seems a bit amateurish when trying to fix its previous mistakes. Every round it gets worse; I wonder where the good programmers they used to have are now…
Well, better for us. Ubuntu seems to be the new OS of choice for many former Windows users, and with recent Microsoft moves that may happen more and more often… Luckily they’ll force everyone out of XP (the last minimally decent thing they did) as they did with Win2000 (the only reasonably decent thing they did) and people will migrate to Ubuntu instead of Vista… Let’s see the outcome by next year…

Got the disks? Use your PSP…
November 30th, 2007 under InfoSec, Fun, rengolin, Computers. [ Comments: 1 ]
Finally some good news for the crackers that got the HMRC disks: they can now easily crack the password-protected spreadsheets while playing Final Fantasy!

LSF, Make and NFS 2
November 27th, 2007 under Unix/Linux, Distributed, rengolin, Computers. [ Comments: none ]
Recently I’ve posted this entry about how NFS cache was playing tricks on me and how sleep 1 kinda solved the issue.
The problem got worse, of course. I raised it to 5 seconds and in some cases it was still not enough, then I learnt from the admins that the NFS cache timeout was 60 seconds!! I couldn’t sleep 60 on all of them, so I had to come up with a script:
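timeout=60   # NFS cache timeout, in seconds
slept=0      # seconds waited so far
# Poll until the file exists and is non-empty, or the timeout expires
while [ ! -s "$file" ] && (( slept < timeout )); do sleep 5; slept=$((slept+5)); done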
In a way it’s not as ugly as it may seem… First, the alternative is to change the configuration (either disable the cache or reduce the timeout) for the whole filesystem, and that would affect others. Second, now I just wait for the (almost) correct amount of time and only when I need to (the first -s will get the file if there is no problem).
At least, sleep 60 on everything would be much worse! 