As previously mentioned, I have a grand plan to do some funky web2.0 style stuff on this web site. However, with the new Yahoo Pipes service, maybe I don’t need the space after all. The premise behind Pipes is to bring the power of Unix pipes and filters to the world of feeds, allowing the mashup of multiple feeds into one. The service supports some pretty powerful bits of functionality (or modules as they call them) to perform analysis and processing of the feeds. There are some good posts over on the O’Reilly Radar about how Pipes work, and one from a rather excited Tim O’Reilly on why he thinks Pipes is a “milestone in internet history”.
Firstly, I’m going to talk about things in this post on which I am by no means an expert and very much a beginner, so please don’t take any of this as gospel.
One of the biggest problems with the move from my wordpress.com hosted blog to hosting on my own site is managing to redirect people to the new blog, and generating new traffic to the hosted site.
The old blog was obviously picked up by Google, other search engines, blog directories like Technorati, and other people’s blogrolls. Thus, any redirection strategy would have to:
- Attract new visitors to the new blog location
- Get people searching for popular post topics to go to the new site
- Try and get people to comment on the new site rather than the old
These aims can be achieved by:
- Making it obvious that the old blog is dead
- Getting the new blog to appear in Search Engine Results Pages (SERPS)
- Getting the old blog to disappear from SERPS!
- Getting blog directories and blogrolls updated
The most obvious thing to do first was to add a post to the old blog which pointed people to the new one, making it clear that they needed to update bookmarks and feeds. I also pared down the template to remove categories, archives, calendar etc. In fact, all that is left is the blogroll. I also:
- Added a text widget to the sidebar with a link to the new blog which would appear on every page of the blog
- Changed the title of the blog to make it a bit more obvious what was going on
I considered disabling comments on each post and perhaps even adding a redirect comment to each post, but this appeared to be too much work to be worthwhile. So far I’ve only had one new comment on any of the old posts (but also no new comments on their equivalents on the new blog).
I’ll cover adding the new blog to search engines later on. However, getting the old one removed is a difficult thing for a wordpress.com hosted blog as you have little control over it. On a normal web site you could add an .htaccess file to the root directory with a 301 redirect. Not so when you don’t have sufficient access to do so! There is an option in the dashboard under Options -> Privacy to block search engines from the site and remove it from WordPress’ own listings. I suspect this does a similar job to a robots.txt, which is the traditional way to stop search engine spiders from indexing your site. Google also have a tool for removing specific URLs, but this is not meant for removing an entire blog.
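For comparison, on a site where you do control the web root, the two mechanisms mentioned above might look something like the following. This is only a minimal sketch and not something that can be applied to a wordpress.com blog: www.example.com is a placeholder for the new address, and the Redirect line assumes an Apache server with mod_alias enabled.

```apache
# .htaccess in the root directory of the OLD site (assumes Apache with mod_alias).
# Permanently (301) redirect every path to the same path on the new site.
# www.example.com is a placeholder for the new blog's address.
Redirect 301 / http://www.example.com/
```

A robots.txt that asks all well-behaved spiders to stay away from the whole site is equally short:

```
# robots.txt in the root directory of the site
User-agent: *
Disallow: /
```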
I then updated any links I have to my blog on other site profiles, created a new Technorati blog profile and fired off emails to a couple of specific IBMer blog rolls.
The final step was of course to get people’s blogrolls updated. One or two picked up on the change by themselves, and a quick email around the others sorted out most of the rest. What you can’t help are links from comments you’ve made on other people’s blogs.
Up until today I’ve not seen traffic to the old site decrease at all. However it looks like my most popular post on the BT Homehub which normally attracted 60 or so views/day has disappeared from Google, and as such it has not received a single hit so far today. Thus it looks like things are beginning to disappear.
Eventually once the new blog is indexed and appearing in SERPS and I’m happy traffic to the old one has reduced to a trickle, it will be time to hit the delete button.
This is a whole different kettle of fish in the self-hosting world compared to wordpress.com. When I created the original blog it appeared on Google and co. in a couple of days. Not so now! As hinted at above, I am a complete beginner in the world of Search Engine Optimization (SEO) so use these comments as hints about what to go and research more thoroughly!
To begin with I set up an index.html (more on that in a separate post) linking to the blog, then went to Google and added the URL. This then led me to find the Google webmaster tools, which in turn led me to research the whole area of robots.txt, sitemaps and the pros and cons of www versus non-www, and all sorts of issues around how search engines look out for duplicate content and the effect it can have on your results. The last item is especially relevant to blogs as the same content often appears at several distinct URLs (archives, categories, posts).
Along the way, I found and installed a couple of useful WordPress plugins:
- DupPrevent – Allows you to configure areas of your blog which won’t be visible to spiders to avoid a duplicate entry penalty.
- Google sitemaps plugin – Generates a sitemap from your blog content. Highly customizable and can include non-blog URLs.
The most disappointing aspect of my research however is the fact that it can take four weeks or so for new sites to appear in search engine results.
For months now, a simple black facade with an Apple logo has hidden one of the shops in Southampton’s West Quay shopping mall. Each time I passed I wondered what was going on inside, and when the store would open. Yesterday Apple emailed out to say that the opening will be at 9am on Saturday 10th February. The first 1000 people through the door get a free t-shirt. I intend to be there with DSLR in tow to capture the events, and maybe pick up a couple of bits for the iMac!
More details on the store over at Apple.
I always thought this would happen. Steve Jobs has posted what amounts to an open letter to the music industry to stop the use of Digital Rights Management for online music downloads. The thoughts contained within it are clear for all to see. Apple Inc. would switch the iTunes Music Store over to a non-DRM format “in a heartbeat” if the big-four music companies would allow it. Jobs elucidates the options they have today: to stay as they are and watch a market fragment into proprietary formats, license FairPlay and watch it get compromised quicker than the blink of an eyelid, or convince the music industry that DRM has never worked, and never will.
Interestingly, Jobs effectively issues a call to arms to the citizens of Europe to put pressure on the big four, both because we have been most vocal in criticizing DRM and as the music industry is effectively centered here. Where do I sign up?
Last Saturday a good number of Hursley folk met up in Winchester for a two hour photography challenge. The brief was simply to take photos, meet up and have a bit of fun. I, along with most people, seemed to gravitate towards the cathedral, though I ventured onto the High Street later on to capture some street and candid shots. Each person then puts forward three of their shots for the competition, though that’s competition in the loosest sense of the word. In any case, here are the three shots I selected:
Winchester Cathedral HDR:
I’m proud of the first one as the burning of the clouds in PSE really made it. The second one is a composite of 9 separate exposures brought together using FDRTools. The third is something a bit different from the other two, but not a bad shot all the same despite the camera shake.
This is probably the most interesting area of the whole move, and the one I’ve spent most time getting right.
One of the things I did like about wordpress.com was the statistics it showed you under the dashboard. They were quite comprehensive and also addictive. The feed stats on the other hand were not so good. I experimented with using FeedBurner, but as you have no control over the default feeds wordpress.com generates all you can do is add a link on the blog pages and hope people use it.
It is quite a shock when the stats tab disappears once you move to a self-hosted wordpress.org blog. As such, the general way people compensate is to use two external pieces of technology to provide much the same (and more) information: Google Analytics for the site stats and FeedBurner for the feed stats.
I’m not going to cover the process of obtaining a FeedBurner feed as that is well known and documented. However it is worth covering what you have to do to the blog to make sure people end up using it.
The first step is to use a plugin. I installed the FeedBurner Feed Replacement plugin. This will substitute the standard WordPress feed for your FeedBurner one whenever anybody clicks on the feed icon in the browser, or on the feeds under the Meta section.
However, there are still two possible ways that people can get to the original WordPress feeds. Firstly, they can copy and paste the links under the Meta section into their feed reader. The feed links that appear here are output by the bloginfo() template tag, which prints the value returned by get_bloginfo(). To modify what gets displayed, I went into the Theme Editor section under Presentation and edited the Sidebar theme file, which originally looked like:
<?php wp_register(); ?>
<?php wp_loginout(); ?>
<a href="feed:<?php bloginfo('rss2_url'); ?>" title="<?php _e('Syndicate this site using RSS'); ?>"><?php _e('<abbr title="Really Simple Syndication">RSS</abbr>'); ?></a>
<a href="feed:<?php bloginfo('comments_rss2_url'); ?>" title="<?php _e('The latest comments to all posts in RSS'); ?>"><?php _e('Comments <abbr title="Really Simple Syndication">RSS</abbr>'); ?></a>
<a href="http://validator.w3.org/check/referer" title="<?php _e('This page validates as XHTML 1.0 Transitional'); ?>"><?php _e('Valid <abbr title="eXtensible HyperText Markup Language">XHTML</abbr>'); ?></a>
<?php wp_meta(); ?>
The interesting lines here are the ones that contain calls to bloginfo() such as:
feed:<?php bloginfo('rss2_url'); ?>
I simply modified these lines to hardcode in the FeedBurner URLs for my main and comment feeds. However, if you are using widgets, then this does not affect the Meta widget and I’ve not yet tried to work out what needs to happen there.
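As a rough sketch, the two modified entries end up looking something like this, with the rest of each line left as it was. The feeds.feedburner.com addresses are placeholders (your-feed-name and your-comments-feed-name are hypothetical), so substitute the URLs FeedBurner gives you.

```php
<?php /* Sidebar theme file: feed links hardcoded to FeedBurner.
         Both URLs below are placeholders, not real feeds. */ ?>
<a href="feed:http://feeds.feedburner.com/your-feed-name" title="<?php _e('Syndicate this site using RSS'); ?>"><?php _e('<abbr title="Really Simple Syndication">RSS</abbr>'); ?></a>
<a href="feed:http://feeds.feedburner.com/your-comments-feed-name" title="<?php _e('The latest comments to all posts in RSS'); ?>"><?php _e('Comments <abbr title="Really Simple Syndication">RSS</abbr>'); ?></a>
```

Only the bloginfo() calls inside the href attributes have been swapped out; the _e() calls that produce the link text are untouched.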
The final way in which people may get to the original feeds instead of your FeedBurner ones is via a feature called feed autodiscovery. This allows people to put just the site hostname (e.g. http://www.adrianspender.com) into their feed reader, and the reader then goes off to discover the correct feed URL. This will return the WordPress feed by default. To modify this, I went into my theme and edited the Header theme file in much the same way as for the Meta section. The difference this time is that there are three hrefs, each for different flavours of RSS and Atom feeds for the main site, not for comments. Simply replace the bloginfo() calls as before with your FeedBurner feed URL. I suggest you ensure your FeedBurner feed has the SmartFeed capability enabled, since one URL is now standing in for all three formats.
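A sketch of what one of the edited autodiscovery entries in the Header theme file might look like follows. The type and title attributes are an assumption about what a default theme emits (yours may differ), and the FeedBurner URL is again a placeholder.

```php
<?php /* Header theme file: autodiscovery link hardcoded to FeedBurner.
         Originally the href held a bloginfo('rss2_url') call.
         The URL below is a placeholder, not a real feed. */ ?>
<link rel="alternate" type="application/rss+xml" title="RSS 2.0" href="http://feeds.feedburner.com/your-feed-name" />
```

The other two link entries in the header get the same treatment, each with its href pointing at the FeedBurner feed.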
Providing a dashboard stats view
Finally, there is a plugin called WordPress Reports which can collate your GA and FB stats into a single view within the WordPress admin, not too dissimilar to that provided by wordpress.com. It doesn’t look anywhere near as good, nor give you as much detail, but may be useful nonetheless.
I’ll cover the customisation I did to the new blog in the context of two areas: templates/themes and plugins/widgets.
In wordpress.org, adding a new template is simply a case of finding one, downloading it, then uploading it to the wp-content/themes directory. It is then immediately available under the Presentation option in admin.
There’s not really much else to say other than I went here to look for themes. And here to read up on them. Of course, there are many, many more available than you get on wordpress.com and you have full control of their customisation without having to pay to edit the CSS! Look out for themes that support widgets if you are used to using them.
Widgets are pretty big on wordpress.com blogs, and the good news is that they are supported on wordpress.org as well, that is as long as the theme you are using supports them. There are also a lot more available. There is a list of widget aware themes and third party widgets here.
Widgets are delivered as plugins which are simply uploaded into the wp-content/plugins directory. You then have to go to the plugins tab in admin to activate them. You will want to activate the Sidebar Widgets plugin whilst you are there.
You should then see the Sidebar widgets option under the presentation tab as you are used to in wordpress.com.
It is worth noting that the Akismet comment spam plugin is available as part of the 2.1 package. However, to enable it you need a wordpress.com API key. As you are reading this you are probably moving from wordpress.com, so handily you already have one. Just go to Users then Your Profile on your old blog and you should see the API key listed. There’s also a Google search and del.icio.us widget present by default.
The only other plugins/widgets I’ve installed so far are covered in the next post about feeds and stats.
This post covers my experiences of data migration from wordpress.com to a self-hosted wordpress.org blog.
WordPress supports a customised RSS based XML format known as WordPress eXtended RSS (WXR) for exporting and importing content from wordpress blogs. It also supports import from a number of other blog platforms, as covered in great detail on the wordpress codex entry.
The WXR file will contain all your posts, comments, categories and custom fields. It won’t contain your blogroll.
On the wordpress.com blog the manage option on the dashboard has an export option. From here you can generate the WXR file for import into the new blog. However, before you do, be sure to go in and delete any spam comments that Akismet has hanging around. There’s no point in transferring them and doing so just increases the size of the WXR file. I forgot to do this first time around, and ran into a PHP error which I suspect was caused by excessive attempts to allocate memory when importing. Re-exporting without the spam comments solved it.
Importing is just a case of going to the admin, manage, import option on the new blog and browsing to the WXR file.
My posts on the old blog were all made by a user with the name “Adrian Spender” and userid aspender. On the new blog the userid was admin. On importing this didn’t appear to cause any problems. However it did mean that new posts appeared to be from a different user to the migrated ones. To solve this I created a new user with the same name (but as it happens a different userid). I’d suggest that before importing you create the same userid/user name user on the new blog as was on the old, and give them administrator access. I suspect this might have got around the issue I then hit.
Looking at the imported entries, any comments or pingbacks I’d made had my name linked back to the old .com blog. I had to go through and edit each comment to update the link to the new blog address to avoid inadvertently redirecting people back to the old blog if they happened to read old comments.
The final step was to migrate over the blogroll. WordPress has a nice facility to import a blogroll from an OPML file, but unfortunately there doesn’t appear to be the equivalent export capability (at least not on wordpress.com). Thus, as I only had a short blogroll, I did it manually. If all your blogroll links happen to be in del.icio.us for example then you could create an OPML from there.
I hit the same PHP memory error when the default wordpress.org template tried to display any blogroll entry. Initially I thought this might be because I had too many, but it then happened when I pared my blogroll down to just one entry. I got around it through the subject of my next post, namely changing the template.
I was planning to write this anyway, but Andy has prompted me to do it whilst the memory is fresh. Probably a good job as it was so easy I never had to write anything down. I’ll attempt to document it in chronological order and bring up any issues I faced as well as resources I referred to. I’ll cover the following topics in a series of posts:
- Hosting and blog install
- Feeds and stats
- Diversion strategy
- Summary / things to do
Choosing a hosting provider
Well this wasn’t too tough. I’d heard a lot of good things about Register1.net and their Technical Director happens to frequent an online forum I visit. The only option was which package to go for. I decided to go for the VDS Pro account because of the bundled in extra features including WordPress, but I’ll come onto why that was probably an unnecessary expense.
I also registered new domains with them at the same time. This simplified and sped up the process dramatically as there were no transfers to deal with. My domains and hosting were provisioned well within an hour of ordering. I decided to bite the bullet and get .com, .net, .org, .co.uk in one go as it is common for the remaining variants to get registered by other people if the original domain gets so much as noticed. In fact I also grabbed aidyspender.com, .co.uk, .net and .org as well, but two of those were already with another provider and need to be moved once they are about to expire.
Access to hosting and facilities.
With my hosting, no shell access via ssh or telnet is allowed for security reasons, leaving a web interface or, more usefully, ftp as the main options for getting content onto the site. The VDS is hosted on Red Hat Enterprise Linux, and is managed through a web interface. It is running Apache 2.0, POP3/IMAP/sendmail/virus/spam, squirrelmail, PHP 4.3.9, Perl, MySQL 4.1.18, phpMyAdmin, analog and webalizer.
As mentioned, the package comes with a 1-click installation option for wordpress.org. However on inspection it was fairly back-level (126.96.36.199) compared to the current version 2.1. On reading up about the requirements I discovered that my levels of PHP and MySQL were fine for 2.1.
I then read the installation instructions, and decided it didn’t seem too much hassle to install 2.1 myself. I followed the detailed instructions which involve:
- Downloading and extracting the package locally.
- Create the database. The installation documents how to do this through phpMyAdmin. However my host doesn’t give me access to create databases explicitly through phpMyAdmin. It has to be done through their hosting web interface (I guess so they can impose limits) but this was straightforward enough. You don’t need to create any tables, or go near any SQL. You need to make a note of the database name, userid, password and hostname (usually localhost)
- The next step is to edit one file in the wordpress package. wp-config.php has a section which needs to be updated to reflect the database settings.
- Upload the files via ftp. As it is all PHP it just needs uploading to the server like any web page. No shell access is required to install. The only decision to make is where to install it. I chose to put it under the /blog directory instead of in the root of my web space so I could maintain my own index page. However wordpress itself can give you the option to have a static index page which it manages within its own CMS. I’m still happy I went with the subdir though.
- Run the install script, simply by loading it in your browser. It takes less than 30 seconds and asks you for the name of the blog and your email address. After that your blog is up and running!
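The wp-config.php edit in the steps above can be sketched as follows. The four values are placeholders (blog_db, blog_user and change-me are hypothetical), to be replaced with the database name, userid, password and hostname noted when the database was created. The rest of the file is left as shipped.

```php
<?php
// wp-config.php: database settings section only.
// All values are placeholders; use the details from your own hosting setup.
define('DB_NAME', 'blog_db');        // hypothetical database name
define('DB_USER', 'blog_user');      // hypothetical database userid
define('DB_PASSWORD', 'change-me');  // hypothetical password
define('DB_HOST', 'localhost');      // hostname, usually localhost
```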
So now I had a blog with the default theme, and the usual wordpress admin via /wp-admin. One thing that is noticeably missing is the blue header bar you get on wordpress.com with a link to the dashboard. You suddenly realise why the meta section of the template has a Site Admin entry! The admin is missing a few things you are used to on .com, notably widgets, blog stats and a few other things, but I’ll get to all that in another post. At this point the shell of my new blog was up and running.
I’m using the iScrobbler client on OS X rather than the default last.fm one as it purports to include the Last Played playlist from iTunes in its scrobblings. This is useful seeing as all my music is either on the iMac, or my iPod. Hence you won’t see any updates whilst at work for example until I dock the iPod.
The range of music won’t exactly be representative of my tastes either as I’m currently in the process of re-ripping my quite substantial music collection to a lossless format (FLAC to be precise.) I think I’ll have to go through and prioritise the order they are done in. Already I am pleasantly surprised by the music Andy and Roo seem to listen to. Especially to find out that Roo is a fellow Reindeer Section fan.
Edited to add: Just discovered that iScrobbler has an option to scrobble any shared music I play. That means anything I play from my Linksys NSLU2 NAS device running mt-daapd gets logged as well. This means I can withdraw my comment about not having my full music library at my disposal!