How to Export Your Mozilla Firefox History as a Dataframe in R

The goal of this post is to export a Mozilla Firefox Browser history and import in R as a dataframe.

Browser history data

Firefox saves your browsing history in a file called places.sqlite. This file contains several tables, like bookmarks, favicons or the history.

To get a dataframe with visited websites, you need two tables from the sqlite file:

  1. moz_historyvisits: it contains all websites you visited with time and date. All websites have an id instead of a readable URL.
  2. moz_places: it contains the translation of the websites id and its actual URL.

More on the database schema:

Import the data into R

sqlite files can be imported with the package RSQLite.

First, find the places.sqlite on your computer. You can get the path, by visiting about:support in Firefox and looking for the Profiles directory.

library(RSQLite)
library(purrr)
library(here)

# connect to database
con <- dbConnect(drv = RSQLite::SQLite(), 
                 dbname = "path/to/places.sqlite",
                 bigint="character")

# get all tables
tables <- dbListTables(con)

# remove internal tables
tables <- tables[tables != "sqlite_sequence"]
# create a list of dataframes
list_of_df <- purrr::map(tables, ~{
  dbGetQuery(conn = con, statement=paste0("SELECT * FROM '", .x, "'"))
})
# get the list of dataframes some names
names(list_of_df) <- tables

Extract browser history

Next, we extract the two tables with the information we need, join them and keep only the visited url, the time and the URL id.

There are two caveats:

  1. The timestamps are saved in the PRTime format, which is basically an unix timestamp and you have to convert it in a human-readable format
  2. Extract the domain of a URL using the urltools package, e.g. getting twitter.com instead of twitter.com/cutterkom
library(urltools)
# get the two dataframes 
history <- list_of_df[["moz_historyvisits"]]
urls <- list_of_df[["moz_places"]]

df <- left_join(history, urls, by = c("place_id" = "id")) %>% 
  select(place_id, url, visit_date) %>% 
  # convert the unix timestamp
  mutate(date = as.POSIXct(as.numeric(visit_date)/1000000, origin = '1970-01-01', tz = 'GMT'),
  # extract the domains from the URL, e.g. `twitter.com` instead of `twitter.com/cutterkom`
         domain = str_remove(urltools::domain(url), "www\\."))

The R package to Create Generative Art


Do you want to create #generativeart with #rstats? I made a package for this purpose. It is called generativeart and you can find it on Github.

You can find more images on my Instagram account @cutterkom.

Description

One overly simple but useful definition is that generative art is art programmed using a computer that intentionally introduces randomness as part of its creation process.
Why Love Generative Art? – Artnome

The R package generativeart let’s you create images based on many thousand points. The position of every single point is calculated by a formula, which has random parameters. Because of the random numbers, every image looks different.

In order to make an image reproducible, generative art implements a log file that saves the file_name, the seed and the formula.

The R package to Create Generative Art weiterlesen

Generative Art: How thousands of points can form beautiful images

These images are based on simple points. This post explains how it works.

1. Step: Point, points, points …

The starting point is a rectangle a grid that is populated with many thousand points, in this case 3,969.

The retangle is placed in a coordinate system, so every point has two coordinates (x, y).

2. Step:

Now, the position of every single points is transformed. This new position is calculated by a formula, which has random parameters. Because of the random numbers, every image looks different.

For example, using a combination of sine, cosine and the random factor:

Circle resembling shapes are created by using a polar coordinate system:

Do it yourself

I wrote a package called generativeart which helps to create those kind of images with R.

You can get the package on Github.