two small git tips

the impact of git and github in the programmer community continues to amaze me. it appears that almost all major open source projects have made the switch to git at this point. naturally, i got interested too and started using git for my latest software project. however, coming from subversion, i found it not that convenient at first. here are two small tips that helped me become more comfortable with git, effectively making it behave more like subversion.

shorter status messages
the first one is about the status command. i find the default output of git status overly verbose. luckily, there is a more concise version: git status -s (for “short”). its output looks similar to what you get from subversion. run the following command to configure git to always show the short status report:

git config status.short true
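note that this sets the option only for the current repository; add --global to make it your default everywhere:

git config --global status.short true

for illustration, the short format condenses each entry to a two-letter status code (the file names here are made up):

 M README
A  newfile.c
?? notes.txt

which is pleasantly close to subversion’s M/A/? columns.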

commit in one command
the second tip concerns committing changes. the normal workflow of making changes and committing them to the repo requires only two steps with subversion (modify a file, commit), while it takes three steps with git (modify file, stage (“add”), commit). it seemed unnecessary and inconvenient to have to issue separate stage and commit commands for an action that takes only one step in subversion. while there may be good reasons why git keeps those two steps apart, i wished for a shortcut. and in fact, there is one: if you pass commit the -a option (for “all”), it will commit all changes to tracked files directly, without the need to stage them first. a typical command may look like this:

git commit -am "implemented XY"
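one caveat: -a only picks up files that git already tracks. a brand-new file still has to be staged once with git add before it is included (file name below is made up):

git add newfile.c
git commit -m "added newfile.c"

after that first add, subsequent modifications to the file are covered by commit -a.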

with these two measures, i found git to be much more usable.

web-based RSS reader alternatives

geeky,life — 21. Apr 2013

the announcement that google will be closing down its RSS reader didn’t go down well with a lot of people, including me. i use the RSS reader daily. i have subscriptions to ca. 30 feeds, mostly for general news (newspaper), tech/linux news and a few for work (some academic journals provide RSS feeds, and i have also set up notifications for a few web pages like job advertisement pages at universities etc. through page2rss). i know we have no right to complain when someone stops providing a free service – still, it is annoying because it means that i have to look for an alternative. an essential criterion for me is that it needs to be web-based, because i use it from different computers and it needs to stay synced.

the most promising alternative so far seems to be the old reader. i really like the minimalist approach. however, it seems it’s more or less a hobbyist project and we’ll have to see how they handle the large influx of new users. i’m also not convinced by their update speed yet. they say that they refresh the RSS sources at least once a day, but once a day is too slow – with google reader, i’m used to getting updates within ca. 15 minutes. another alternative is feedly, but it requires a browser add-on and looks bloated. actually, i have more hope for a new reader that is being developed by digg. let’s see what they can cook up.

the question remains why google is suddenly closing down the reader. in the announcement, they are hinting at the possibility that it has something to do with their ongoing attempt to concentrate their activities around google+. but of course social media sites are not a full replacement for a full-fledged RSS reader. google also mentions that usage of their reader has been declining. i have a hard time believing that there is not enough demand, though, when i look at the user stats of the old reader before and after google’s announcement. this is more likely to be a business decision. google’s core business is ads. that’s why they don’t want you to read the news inside the RSS reader, they want you to read it on the webpage itself, where you’ll see the ads. of course in reality that won’t change anything because people will simply move to another RSS reader. besides, any sane person has an ad-blocker installed anyway.

update 30. july 2013:
google reader has been dead now for ca. 1 month. i’ve since more or less settled on using the old reader; however, that was not a very long-lasting solution, because they just announced that, following a disastrously failed storage migration, they are going to close the service to the public. well…
the good news is that in the meantime, the digg reader has been released as well, and i really like it. it’s quite basic but gets the job done, just what i need. so unless there are any bad surprises with the digg reader in the future, it looks like it will be my RSS reader of choice from now on.

update 9. october 2013:
the old reader has been brought back to life and, after using it and the digg reader in parallel for a while, i’ve now more or less settled on using the former.

geography training with HTML5

educational,geeky,webdesign — 1. Mar 2012

i was having some fun with native ‘drag-and-drop’ in HTML5. as an exercise for myself, i designed a geography training app. head over to map-o-matic to check it out. needless to say, this works only with the latest generation of web browsers (works on firefox/chrome, doesn’t work on IE).

i mostly followed a tutorial over at html5rocks.com.

this was also my first time translating a PHP webapp with gettext/poedit. pretty nifty :)

‘locate’ on ubuntu with an encrypted home dir

free software,geeky,linux,ubuntu — 5. Oct 2011

recently, i noticed that the locate command on my ubuntu system didn’t work as expected. it simply didn’t list files located in my $HOME dir, while it did still list files in the system directories. it took me a while to figure out that this behaviour was due to the fact that i decided to check the “encrypt home dir” option when i last (re-)installed the OS.

on second thought, it makes sense that it works that way: the command that updates locate’s database (updatedb.mlocate on ubuntu) is run as a root cronjob, and as such it can’t see the contents of the home dir while it’s encrypted (the decrypted view only exists while the user is logged in). on the other hand, understanding this requires quite a bit of prior knowledge about how locate works, and i think it’s a bit rough to let users figure this out all by themselves, without so much as a warning. the situation would be much improved if locate at least spat out a warning that it can’t access the home dir, instead of the ominous silence from which we usually conclude that no matching files exist on the disk.

after some googling, i found a good solution for this problem. this guide explains how to set up locate to store a separate, user-specific database inside the encrypted home directory. this also requires a user-specific cronjob. after following that guide, locate once again works just as expected on my system.
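for reference, the core of that setup looks roughly like this (paths and schedule are my own choices, not necessarily the guide’s):

# build a private database of the decrypted home dir; run as the normal user
updatedb -l 0 -o "$HOME/.locate.db" -U "$HOME"

# search the system database and the private one together
locate -d /var/lib/mlocate/mlocate.db -d "$HOME/.locate.db" some-pattern

# refresh the private database via the user’s own crontab (crontab -e):
15 * * * * updatedb -l 0 -o "$HOME/.locate.db" -U "$HOME"

(as an aside, i believe ubuntu’s /etc/updatedb.conf lists ecryptfs under PRUNEFS, so the root cronjob skips encrypted homes deliberately – presumably to keep private file names out of the system-wide database.)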

welcome to html5

geeky,webdesign — 13. Feb 2010

note: this will work only on very modern browsers (latest firefox, opera, safari). it will work partially on chrome (no audio), and it will not work at all on internet explorer.

update march 2012: major update! the game logic has been re-written from scratch (it was buggy), the game now has 5 levels, a reset button, fading effects, and even one very simple animation :) in addition, audio should now work on chrome.

matching hard- and software

geeky,linux — 20. May 2009

“If a company designs both hardware and software,
it can build much better systems than if they only design the
software. That’s why Apple’s iPhone is so much better than
Microsoft phones.”

this statement comes from larry ellison, oracle’s CEO, in a recent reuters interview.

what he says is a simple truth, almost trivial, yet it can’t be stressed enough how significant it is. while i personally couldn’t care less about phones, the statement holds just as well for desktop / laptop computers. frankly, both windows and linux desktop OSes work crappily on many computers today: you will get devices without proper support, driver issues, incompatibilities between components and all sorts of other problems. and this problem will never go away as long as the hardware and the software are not engineered together. there are tens of thousands of different PC devices / components out there today, and the number of possible combinations is astronomical – there is just no way any OS could ever support all of them equally well.

the solution, then, is to buy hardware and software that come from the same company and have been designed to work together. both windows and linux fail in this regard; only apple (and sun) get this right as of today. and this is IMHO the main reason why apple is so successful these days. it’s just not possible to get the same stability and reliability with an OS that is supposed to work “on any PC hardware”.

hopefully, we will get linux computers at some point in the future that are engineered in this way. the company could make money from the hardware, and the software could still be free/open source. i at least would be happy to pay the extra charge.

google’s secret

geeky — 23. Dec 2008

google got all fat & rich for one single reason: they excel at sorting search engine hits. the secret behind this is called PageRank, a clever algorithm that ranks pages by their incoming links. they even explain the basics of how it works on their homepage. how nice of them.

but hold on a second. if all of their success was based on one single business secret, why in the world would they be talking so openly about it on their webpage? granted, they don’t give any implementation details there, but still, why put the competitors/cloners on the right track with information about the way PageRank works?

personally, i think it’s a priori much more likely that the information about PageRank on the google homepage is deliberately misleading, so as to fool the competition (and the general public…).

but what else could be google’s secret, then?

well, try this: go to google and search for ‘antoine meillet’ (feel free to use your favourite linguist instead ;). look at the results and hover over the links. you will see nothing strange: the full URL in the status bar will indicate that they are direct links to the target webpage.

or maybe not? check out the source code of the google result page (on firefox, select the text and choose ‘view selection source’ from the right-click menu – god i love that browser). surprise, surprise: the href attribute does not in fact contain a direct link to the target (we would expect href="http://en.wikipedia.org/wiki/Antoine_Meillet"), but instead something really cryptic like:

href="/url?sa=t&source=web&ct=res&cd=1&url=http%3A%2F%2Fde.wikipedia.org%2Fwiki%2FAntoine_Meillet&ei=IzJQSaP3JY_m0gXV7KWEBA&usg=AFQjCNFHbV4saUM80cY7BQ6pfEYNAGnD0A&sig2=a9sZLl2cPM-V7WC5dGs--A"

now the url= parameter is easy enough to decode – it’s simply the percent-encoded address of the real target – but how they manage to still show the plain address in the browser’s status bar is a mystery to me. what seems quite obvious, though, is that people are in fact never directly forwarded to the target site when they click on a google search hit. instead, it looks like they are secretly routed back through google HQ. since it’s rather difficult to figure out what exactly happens when you click on that link (too much javascript involved…), it’s easier to just log the browser’s activity and see what’s going on behind the scenes. sounds like a job for wireshark (good thing i use linux, where great tools like this one come included!).
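(for the record, the same list can be produced on the command line with a reasonably recent tshark, wireshark’s terminal twin – the interface name eth0 is just an assumption, use whatever your system calls yours:

tshark -i eth0 -Y 'http.request.method == "GET"' -T fields -e http.host -e http.request.uri

this obviously only works for plain HTTP; it can’t look inside encrypted HTTPS connections.)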

here’s what wireshark’s list of all the HTTP GET requests on my network interface showed when i clicked on one of the links from google’s search results: the first HTTP GET request went back to google! only the second (and the following ones) went to wikipedia.

with the help of wireshark, we have therefore confirmed that users are being routed back through google before they reach their actual destination. so the question is: what could be the purpose of this? simple: google will store the keywords of your search along with the hits from the result list that you actually clicked on, trusting that you will look through the results google presents and pick the relevant hit(s) from among them. with this information, they can increase the ranking of the hits you clicked on, and decrease the ranking of those you skipped, all in relation to your specific search keywords.

if true, it means that google in fact lets its users do the sorting for them. humans are much better at picking the relevant hits out of a mass of irrelevant ones, and since google has the means to collect that information, why not use it to improve the ranking? it seems not too far-fetched, then, to suspect that the core of PageRank is in fact not a fancy algorithm at all – but simply a clever way to let the users rank the search results, by (secretly) collecting data on which hits they clicked on and which ones they didn’t.

ps: further investigation showed that the cryptic links in the source of google’s search result pages are not always there. but even in those cases, wireshark shows an HTTP GET request sent to google before the actual link target is loaded.