Download Our Data

We hope that other researchers and Internet users will benefit from our bookmark collection. You can download the most recent snapshot of our database. We protect the anonymity of the users and hide their identity completely.

Instructions

Download the file using this link. You will receive a text file listing our donated links, each followed by a tab and its associated strength in our network. Link/strength pairs are separated by empty lines. Each request retrieves 100 of our links, beginning at the given start_id. To retrieve all of our data, set start_id to 1, then to 101, then to 201, and so on. In our opinion, the best way to get a single batch is to issue

wget "http://givealink.org/rss/all_links?start_id=1"
on a UNIX command line. Please sleep at least 3 seconds between requests to limit system load.
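
The batching scheme above can be scripted. The following is a minimal sketch rather than an official client: the function names `start_ids` and `fetch_batches`, the output file names, and the decision to stop at the first empty response are all assumptions, since this page does not say how the end of the data is signaled.

```shell
# Hypothetical helpers; names, file layout, and stop condition are assumptions.

# Print the start_id of each of the first N batches: 1, 101, 201, ...
start_ids() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        echo $((i * 100 + 1))
        i=$((i + 1))
    done
}

# Fetch up to N batches into links_<start_id>.txt, sleeping 3 seconds
# between requests as asked above.
fetch_batches() {
    for id in $(start_ids "$1"); do
        out="links_${id}.txt"
        wget -q -O "$out" "http://givealink.org/rss/all_links?start_id=${id}" || break
        # Assumption: an empty response means there is no more data.
        [ -s "$out" ] || { rm -f "$out"; break; }
        sleep 3
    done
}
```

For example, `fetch_batches 50` would request start_id values 1 through 4901. The URL is quoted so that the shell does not treat the `?` as a wildcard.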


One may also download the tags associated with each link. The format is similar to that of the 'all_links' command; the only difference is that each URL/strength pair is followed by its tags, delimited by spaces, e.g.

url [tab] strength [tab] tag [sp] tag [sp] tag [sp]\n
Each of these records is separated from the next by an empty line. Again, the best way to access this data in UNIX is
wget http://givealink.org/rss/all_links_and_tags
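
Once a batch has been saved, the record layout described above is straightforward to split with awk. This is a sketch under stated assumptions: `url_tag_pairs` is a hypothetical helper name, and the input is assumed to follow the layout above exactly (url, strength, and a space-delimited tag list, tab-delimited, with empty lines between records).

```shell
# Hypothetical helper: print "url<TAB>tag" for every tag in a downloaded batch.
# Reads the named files, or standard input if none are given.
url_tag_pairs() {
    awk -F'\t' '
        # Skip empty separator lines and records that carry no tag field.
        NF >= 3 {
            # Splitting on " " ignores the trailing space after the last tag.
            n = split($3, tags, " ")
            for (i = 1; i <= n; i++)
                print $1 "\t" tags[i]
        }
    ' "$@"
}
```

Calling `url_tag_pairs all_links_and_tags` on a saved file yields one line per URL/tag pair, which is convenient for piping into `sort` or `uniq -c`.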


Of course, we make our triples freely available as well. As before, the data is returned in increments of 100. Here is an example:

wget "http://givealink.org/rss/all_links_and_tags_and_users?start_id=1"
Each triple consists of a one-way hash that uniquely identifies a user, a tag, and a URL, all delimited by tabs.
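
As an illustration of working with the triples, the sketch below counts how many distinct hashed users bookmarked each URL. The helper name `users_per_url` is hypothetical, and the column order (user hash, tag, URL) is an assumption taken from the order in which the fields are listed above.

```shell
# Hypothetical helper: count distinct (hashed) users per URL in a batch of
# triples. Assumes the columns appear in the order: user hash, tag, URL.
users_per_url() {
    tab=$(printf '\t')
    awk -F'\t' '
        # Count each (user, URL) combination once, however many tags it has.
        NF >= 3 && !seen[$1 SUBSEP $3]++ { count[$3]++ }
        END { for (url in count) print count[url] "\t" url }
    ' "$@" | sort -t "$tab" -k1,1nr -k2,2
}
```

The output lists the most widely bookmarked URLs first, one "count [tab] url" line per URL.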


You can then use another of our services to download the similarity matrix by sending that feed a URL obtained from the plain text file.