UK Pub History in London and the Home Counties

Pubs in London and the South are listed on my UK pub history site if they have any historical interest.


A simple Google sitemap generator tool - using simple Linux commands

There are a number of excellent sites which list the pubs in London and their history, as listed in the links at the side.

The best is probably here; it largely lists pubs geographically, by church parish. Sorting them this way may sound odd, but churches rarely move, and they also hold the parish records for that geographical area.

To get your sites recognised in Google and other search engines, there are a number of useful tips.

I rank #1 for pub history, and one of the reasons for this is that many of my sites have relevant names, such as pubshistory.com or pubhistory.co.uk.

A tremendously useful service which Google offers is Webmaster Tools, an online service that runs some stats on how well the site is performing and whether there are any major issues with missing pages and links.

The first step with this service after adding a site name is to add a simple verification html page to the site.

The other really important thing they recommend is to submit a sitemap of all of the pages on the site. This just needs to be a simple text file listing the HTML pages on the site, one link per line. The simplest way is just to type these in by hand, but when you have in excess of 20 or 30 thousand pages, it is quicker to automate this task.

This only works in Linux, so don't bother reading further if you have a Windows server. BUT, if you have a local site listing and an Ubuntu live disk, you could probably also run this locally.

So, what do you need to do?

Step 1: log in to the Linux server for the site you want to list.

I usually use the cd command to get to the relevant top level directory for the site, e.g.

/var/www/vhosts/southernpub.co.uk/httpdocs

In this directory, run the find command on the output of `pwd` and redirect the result to a file of any name, e.g.:

find `pwd` > output

If you read this file the entries will look like this:

/var/www/vhosts/southernpub.co.uk/httpdocs/Devon/Bickington
/var/www/vhosts/southernpub.co.uk/httpdocs/Devon/Bickington/index.shtml

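All I am interested in is the HTML files, so the next step is to read this file and keep only the lines that end in html. Anchoring the match to the end of the line stops it picking up directories, or the outputhtml.txt file itself if you run this a second time:

grep 'html$' output > outputhtml.txt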

/var/www/vhosts/southernpub.co.uk/httpdocs/Devon/Bickington/index.shtml
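
If you would rather skip the intermediate file, the find and grep steps above can probably be combined into a single find command. This is only a rough sketch: list_pages is a made-up helper name, and it assumes your pages end in .html or .shtml.

```shell
# list_pages: hypothetical helper, not part of any standard tool.
# Prints the full path of every .html or .shtml file under the
# directory given as the first argument.
list_pages() {
    # -type f skips directories; the two -name tests keep page files only.
    find "$1" -type f \( -name '*.html' -o -name '*.shtml' \)
}

# Example use from the top level directory of the site:
#   list_pages "$(pwd)" > outputhtml.txt
```

Because the match is on the file name rather than on the whole path, a directory that happens to contain "html" in its name is not picked up.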

The last trick is to change this into a valid page URL.

There are simple ways of doing a search and replace in Linux, but I just open the file in TextPad and run a simple replace to strip the leading /var/www/vhosts/ and the /httpdocs directory, and put http:// at the start of each line. We now have

http://southernpub.co.uk/Devon/Bickington/index.shtml which is one of the many thousands of page links in this file.
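
For anyone who prefers to stay on the command line, the same replace can be sketched with sed. The to_urls name is invented for this example, and the two patterns assume the /var/www/vhosts/<domain>/httpdocs layout shown above, so adjust them for your own server.

```shell
# to_urls: hypothetical helper. Reads file paths on standard input
# and writes the matching page links on standard output.
to_urls() {
    # First expression: swap the server path prefix for http://
    # Second expression: drop the httpdocs directory from the path.
    # "|" is used as the sed delimiter so the slashes need no escaping.
    sed -e 's|^/var/www/vhosts/|http://|' -e 's|/httpdocs/|/|'
}

# Example use (sitemap.txt is just an example file name):
#   to_urls < outputhtml.txt > sitemap.txt
```

Writing to a second file is deliberate here: redirecting the output back onto the file being read would empty it before sed has read anything.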

This outputhtml.txt now needs to be uploaded to the relevant site, and Google needs to be told about it.
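
Before uploading, it may be worth a quick sanity check of the file. The sketch below assumes every line should start with http:// and that the file should stay within the 50,000 URL limit which the sitemap protocol sets for a single sitemap file; check_sitemap is a made-up name.

```shell
# check_sitemap: hypothetical helper. Reports the line count and the
# number of lines that do not look like links. Succeeds only if every
# line starts with http:// and there are at most 50000 lines.
check_sitemap() {
    file="$1"
    # grep -c '' counts every line in the file.
    total=$(grep -c '' "$file" || true)
    # grep -vc counts the lines that do NOT match the pattern.
    bad=$(grep -vc '^http://' "$file" || true)
    echo "$total lines, $bad not starting with http://"
    [ "$bad" -eq 0 ] && [ "$total" -le 50000 ]
}

# Example use:
#   check_sitemap outputhtml.txt && echo "looks OK to upload"
```

If the site has grown past the limit, the list would need splitting into several files, which this sketch does not attempt.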

So, lastly, we go back to the Google Webmaster Tools site and submit the new sitemap file, and it does the rest.

A simple Google sitemap generator tool from your own home.


Last updated on: Sunday, 26-Feb-2017 14:26:18 GMT