‘locate’ on ubuntu with an encrypted home dir

free software,geeky,linux,ubuntu — 5. Oct 2011

recently, i noticed that the locate command on my ubuntu system didn’t work as expected. it simply didn’t list files located in my $HOME dir, while it still listed files in the system directories. it took me a while to figure out that this behaviour was caused by the “encrypt home dir” option, which i had checked when i last (re-)installed the OS.

on second thought, it makes sense that it works that way, since the command that updates locate’s database (updatedb.mlocate on ubuntu) is run as a root cronjob, and as such it can’t read the home dir while it’s encrypted. on the other hand, understanding this requires quite a bit of prior knowledge about how locate works, and i think it’s a bit rough to let users figure this out all by themselves, without so much as a warning. the situation would be much improved if locate at least printed a warning that it can’t access the home dir, instead of the ominous silence from which we usually conclude that no matching files exist on the disk.

after some googling, i found a good solution for this problem. this guide explains how to set up locate to store a separate, user-specific database inside the encrypted home directory. this also requires a user-specific cronjob. after following that guide, locate once again works just as expected on my system.
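the guide itself isn’t reproduced here, but the general idea can be sketched roughly as follows. this is only a sketch under assumptions: the database file name (.mlocate.db) and the helper functions are made up for illustration, and the options are the ones i understand mlocate’s updatedb and locate to accept (-l 0 to skip the visibility check, -o for the output file, -U for the root of the scan, -d to pick the database when searching). the guide’s actual steps may differ.

```python
import os


def updatedb_command(home, db_name=".mlocate.db"):
    """Build an updatedb call that indexes only `home` into a per-user database.

    `db_name` is a hypothetical file name chosen for this sketch.
    """
    db_path = os.path.join(home, db_name)
    # -l 0: do not require file visibility checks (the database is private anyway)
    # -o:   write the database to this file instead of the system-wide one
    # -U:   start scanning at this directory instead of /
    return ["updatedb", "-l", "0", "-o", db_path, "-U", home]


def locate_command(home, pattern, db_name=".mlocate.db"):
    """Build a locate call that searches the per-user database."""
    db_path = os.path.join(home, db_name)
    # -d: search this database instead of the default one
    return ["locate", "-d", db_path, pattern]


if __name__ == "__main__":
    home = os.path.expanduser("~")
    # Print the commands instead of running them, so the sketch has no side effects.
    print(" ".join(updatedb_command(home)))
    print(" ".join(locate_command(home, "notes.txt")))
```

the first command is the one that would go into the user’s own crontab (it only works while the user is logged in and the home dir is decrypted), and the second one could be wrapped in a shell alias so that a plain locate call searches the private database.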

welcome to html5

geeky,webdesign — 13. Feb 2010

note: this will work only on very modern browsers (latest firefox, opera, safari). it will work partially on chrome (no audio), and it will not work at all on internet explorer.

update march 2012: major update! the game logic has been re-written from scratch (it was buggy), the game now has 5 levels, a reset button, fading effects, and even one very simple animation :) in addition, audio should now work on chrome.

matching hard- and software

geeky,linux — 20. May 2009

“If a company designs both hardware and software,
it can build much better systems than if they only design the
software. That’s why Apple’s iPhone is so much better than
Microsoft phones.”

this statement comes from larry ellison, oracle’s CEO, in a recent reuters interview.

what he says is a simple truth, almost trivial, yet it can’t be stressed enough how significant it is. while i personally couldn’t care less about phones, the statement of course holds just as well for desktop / laptop computers. frankly, both windows and linux desktop OSes work crappily on many computers today. you get devices without proper support, driver issues, incompatibilities between components and all sorts of other problems. and this problem will never go away as long as the hardware and the software are not engineered together. there are countless different PC devices / components out there today, and there is just no way any OS could ever support all of them – and all combinations of them – equally well.

the solution, then, is to buy hardware and software that come from the same company and have been designed to work together. both windows and linux fail in this regard; only apple (and sun) get this right as of today. and this is IMHO the main reason why apple is so successful these days. it’s just not possible to get the same stability and reliability with an OS that is supposed to work “on any PC hardware”.

hopefully, we will get linux computers at some point in the future that are engineered in this way. the company could make money from the hardware, and the software could still be free/open source. i at least would be happy to pay the extra charge.

google’s secret

geeky — 23. Dec 2008

google got all fat & rich because of one single reason: they excel at sorting search engine hits. the secret behind this is called PageRank, a clever algorithm to sort links. they even explain the basics of how that works on their homepage. how nice of them.

but hold on a second. if all of their success was based on one single business secret, why in the world would they be talking so openly about it on their webpage? granted, they don’t give any implementation details there, but still, why put the competitors/cloners on the right track with information about the way PageRank works?

personally, i think it’s a priori much more likely that the information about PageRank on the google homepage is deliberately misleading, so as to fool the competition (and the general public…).

but what else could be google’s secret, then?

well, try this: go to google and search for ‘antoine meillet’ (feel free to use your favourite linguist instead ;). look at the results and hover over the links. you will see nothing strange: the full URL in the status bar will indicate that they are direct links to the target webpage.

or maybe not? check out the source code of the google result page (on firefox, select the text and choose ‘view selection source’ from the right-click menu – god i love that browser). surprise, surprise: the href attribute does in fact not show a direct link to the target (we expect: href="http://en.wikipedia.org/wiki/Antoine_Meillet"), but instead shows something really cryptic like:

href="/url?sa=t&source=web&ct=res&cd=1&url=http%3A%2F%2Fde.wikipedia.org%2Fwiki%2FAntoine_Meillet&ei=IzJQSaP3JY_m0gXV7KWEBA&usg=AFQjCNFHbV4saUM80cY7BQ6pfEYNAGnD0A&sig2=a9sZLl2cPM-V7WC5dGs–A"
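one part of that href is easy to read, at least: the url= parameter is just the percent-encoded target address. a minimal python sketch that pulls it out (the function name redirect_target is made up for this example, and the sample href is a shortened copy of the one above, without the ei / usg / sig2 parameters):

```python
from urllib.parse import urlsplit, parse_qs


def redirect_target(href):
    """Return the decoded `url` parameter of a redirect-style href, or None."""
    # urlsplit separates the path ("/url") from the query string;
    # parse_qs decodes percent-escapes such as %3A and %2F.
    query = parse_qs(urlsplit(href).query)
    values = query.get("url")
    return values[0] if values else None


if __name__ == "__main__":
    href = ("/url?sa=t&source=web&ct=res&cd=1"
            "&url=http%3A%2F%2Fde.wikipedia.org%2Fwiki%2FAntoine_Meillet")
    print(redirect_target(href))
```

this only decodes the visible target. it says nothing about what the other parameters are for.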

now what exactly all of those parameters mean (and how they manage to still show the simple address in the browser’s status bar) is a mystery to me. but what seems quite obvious is that people are in fact never sent directly to the target site when they click on a google search hit. instead, it looks like they are secretly routed back through google HQ. now since it’s rather difficult to figure out what exactly happens when you click on that link (too much javascript involved…), it’s easier to just log the browser’s activity to see what’s going on behind the scenes. sounds like a job for wireshark (good thing i use linux, where great tools like this one come included!).

here’s wireshark’s list of all the HTTP/GET requests that happened on my network interface when clicking on one of the links from google’s search result:

notice that the first HTTP/GET request went back to google! only the second (and the following ones) went to wikipedia:

with the help of wireshark, we have therefore confirmed that the users are being routed back through google before they reach their actual destination, so the question is: what could be the purpose of this? simple: google will store the key words of your search along with those hits from the search result that you actually clicked on, trusting that you will look through the list of results that google presents and choose the relevant hit(s) from among them. with this information, they will increase the ranking of the hits you clicked on, and decrease the ranking of those hits which you skipped, all in relation to your specific search keywords.

if true, it means that google does in fact let the users do the sorting for them. humans are much better at picking out relevant hits from among a mass of irrelevant ones, and since google has the possibility to collect that information, why not use it to improve the ranking? it seems not too far-fetched, then, to suspect that the core of PageRank is in fact not a fancy algorithm at all – but that it is simply a clever way to let the users rank the search results for them, by (secretly) collecting data on which hits the users clicked on and which ones they didn’t.
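to make the idea concrete, here is a toy python sketch of such a click-feedback ranking. to be clear, this is purely an illustration of the guess above, not anything google is known to do: the class name, the scoring weights (+1 for a click, -0.25 for a skipped hit) and the URLs “a”, “b”, “c” are all invented for the example.

```python
from collections import defaultdict


class ClickFeedbackRanker:
    """Toy model: per-query scores that go up on clicks and down on skips."""

    # Arbitrary weights, chosen only for this illustration.
    CLICK_BONUS = 1.0
    SKIP_PENALTY = 0.25

    def __init__(self):
        # Maps (query, url) -> accumulated score.
        self.scores = defaultdict(float)

    def record(self, query, shown, clicked):
        """Update scores for one result page: `shown` hits, of which `clicked` were used."""
        for url in shown:
            if url in clicked:
                self.scores[(query, url)] += self.CLICK_BONUS
            else:
                self.scores[(query, url)] -= self.SKIP_PENALTY

    def rank(self, query, candidates):
        """Sort candidates by score for this query; ties keep their original order."""
        return sorted(candidates, key=lambda url: -self.scores.get((query, url), 0.0))


if __name__ == "__main__":
    ranker = ClickFeedbackRanker()
    # One user searched, saw three hits, and clicked only the last one.
    ranker.record("antoine meillet", ["a", "b", "c"], {"c"})
    print(ranker.rank("antoine meillet", ["a", "b", "c"]))
```

note that the scores are stored per search query, so a click for one set of keywords doesn’t affect the ranking for another, which matches the “all in relation to your specific search keywords” part of the guess.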

ps: further investigation showed that the cryptic links in the source of google’s search result pages are not always there. but even in those cases, wireshark shows an HTTP/GET request sent to google before the actual link target is loaded.
