Migrate from Tumblr to Github Pages
[technology
blog
go
golang
]
Recently, I’ve decided to move away from Tumblr. I choose Github Pages, mostly because Jekyll.
Although Jekyll provides a tool to import content from Tumblr, I decided to implement one on my own using Go.
For starters, Tumblr expose a simple API endpoint to extract content for a given user, supporting pagination and filtering. In its most simple use, you just need to GET
from https://[username].tumblr.com/api/read/
to get an XML with the aggregated posts. In go, we can retrieve a resource with the http
library
response, err := http.Get(url)
Where url
is a type string
. We can hardcore the url, and it would be fine, but we can also leverage the net/url
framework to build our request.
func produceURL() string {
params := map[string]string{
"num": "50",
"type": "text",
"filter": "none",
}
return getURL("https://iamvolonbolon.tumblr.com/", "api/read/", params)
}
func getURL(base string, path string, params map[string]string) string {
b, e := url.Parse(base) // Let's insatiate the URL
if e != nil {
return ""
}
b.Path += path // Add the corresponding path
p := url.Values{} // And transform our slice of strings into an encoded set of params
for k, v := range params {
p.Add(k, v)
}
b.RawQuery = p.Encode()
return b.String()
}
Now, that we have our url appropriately compose, we can call the API.
func getXMLFrom(url string) ([]byte, error) {
r, e := http.Get(url)
if e != nil {
return []byte{}, fmt.Errorf("Error retrieving %v", url)
}
defer r.Body.Close()
if r.StatusCode != http.StatusOK { // Checking if the response code is 200
return []byte{}, fmt.Errorf("Status error: %v", r.StatusCode)
}
d, pe := ioutil.ReadAll(r.Body)
if pe != nil {
return []byte{}, fmt.Errorf("Error processing response: %v", pe)
}
return d, nil // Returns a byte slice with the response body
}
In order to parse the response xml into native data types, we need to define a set of structs. The xml structure is documented
<tumblr version="1.0">
<tumblelog ... >
...
</tumblelog>
<posts>
<post type="regular" ... >
<regular-title>...</regular-title>
<regular-body>...</regular-body>
</post>
...
</posts>
</tumblr>
We will need three structs
type post struct {
DateGmt string `xml:"date-gmt,attr"`
Date string `xml:"date,attr"`
Slug string `xml:"slug,attr"`
RegularTitle string `xml:"regular-title"`
RegularBody string `xml:"regular-body"`
Tag []string `xml:"tag"`
}
type posts struct {
Post []post `xml:"post"`
}
type tumblr struct {
Posts posts `xml:"posts"`
}
Once defined the structs, encoding/xml
can handle the parsing:
var d tumblr
xml.Unmarshal(xmlBytes, &d)
I’ve defined a couple of methods to format the post
structure as required by Jekyll.
func (p post) formatHeader() string {
str := "---\n"
str += "layout: post\n"
str += "title: \"" + p.RegularTitle + "\"\n"
str += "date: " + p.DateGmt + "\n"
str += "tags: " + strings.Join(p.Tag, " ") + "\n"
str += "---\n"
return str
}
func (p post) formatPost() string {
header := p.formatHeader()
body := p.RegularBody
str := header + "\n" + body
return str
}
Now, it is just a matter of looping through the []post
slice.
for i := 0; i < len(d.Posts.Post); i++ {
post := d.Posts.Post[i]
date := strings.Split(post.DateGmt, " ")[0]
fn := date + "-" + post.Slug + ".markdown"
p := post.formatPost()
err := ioutil.WriteFile(fn, []byte(p), 0644)
if err != nil {
os.Exit(2)
}
}