Author Topic: Sitemap - Is there any plugin module?  (Read 128 times)

onlineservices

  • Veteran Member
  • *****
  • Posts: 242
  • Karma: +0/-0
Sitemap - Is there any plugin module?
« on: May 24, 2018, 03:50:04 AM »
Is there a sitemap plugin or module for Etano? When things change, it would help tell search engines which pages have been added or removed. Any ideas or solutions?

Thank You

maverick

  • Administrator
  • Veteran Member
  • *****
  • Posts: 3108
  • Karma: +210/-7
    • Maverick Webworks
Re: Sitemap - Is there any plugin module?
« Reply #1 on: May 24, 2018, 08:43:52 AM »
No, there's no sitemap plugin or module for Etano. An XML sitemap is considered good practice for getting important pages indexed more quickly by Google, but it's not imperative: Google's bots will eventually crawl your site (every month or so) and index any pages they find and deem worthy of being indexed. Having a sitemap doesn't guarantee that every page listed in it will get indexed.

Creating a sitemap plugin for Etano poses some challenges. Of course people will expect and want the plugin to do all the work for them, such as automatically including new pages and removing deleted ones. The hard part is determining which pages get included in the sitemap. Many Etano users only allow access to certain pages to members who are logged in, and Google isn't going to like it if the sitemap is full of links to pages that aren't accessible. Others allow public access to profile pages and want the sitemap to link to every member's profile page, which could create massive sitemaps that are potentially a tad hard on server resources. That won't be good for those on low-end or cheap hosting plans.

There are plenty of 3rd party sitemap generators available. Some are free and some are paid services; it all depends on what you want and how much control you want over what gets included. Everyone's needs are going to vary depending on how they configure access levels in Etano, so I would suggest you do some research and find something you feel will suit your needs.
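
To illustrate the "which pages get included" problem above, here is a minimal Python sketch of a hand-rolled generator. It is not part of Etano; the domain and page names are hypothetical placeholders, and it assumes you maintain the list of publicly reachable pages yourself, so that nothing behind the member login ends up in the sitemap.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, public_paths):
    """Return sitemap XML (as a string) for the given public paths."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for path in public_paths:
        url = ET.SubElement(urlset, "url")
        loc = ET.SubElement(url, "loc")
        # Join base and path with exactly one slash between them.
        loc.text = base_url.rstrip("/") + "/" + path.lstrip("/")
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical list: only pages a logged-out visitor can actually open.
public_pages = ["index.php", "register.php", "search.php"]
print(build_sitemap("https://www.example.com", public_pages))
```

The sitemap protocol caps a single file at 50,000 URLs, so a site that lists every member profile would have to split the list across several files, which is where the server load mentioned above starts to matter.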

Fusion Responsive Template & Free Mods
http://www.maverickwebworks.com
DO NOT PM me asking for personal help. Post your problem or request in the forums so the entire community can contribute and benefit.

onlineservices

Re: Sitemap - Is there any plugin module?
« Reply #2 on: May 25, 2018, 04:38:00 AM »
In general I use https://www.xml-sitemaps.com/ to generate sitemaps for my portals. I generated the sitemap as usual and submitted it to Google Search Console (formerly Google Webmaster Tools). It has about 145 pages with page priorities. Unfortunately, the sitemap was rejected with a notification saying:
Quote
Some important page is blocked by robots.txt.

When we look at robots.txt, it contains:
Quote
User-Agent: *
Crawl-delay: 10
Allow: /
Disallow: /admin
Disallow: /ajax
Disallow: /events
Disallow: /fckeditor
Disallow: /images
Disallow: /includes
Disallow: /js
Disallow: /media
Disallow: /processors
Disallow: /skins_site
Disallow: /tmp
Disallow: /tools

To get at least basic traffic to user profiles, we need to let robots in to some safe extent, but that was not possible.

I am also not sure which of the above folders should be allowed in order to make user profiles visible to search engines, which several users recommend doing.

However, I was able to submit the sitemap when the portal was running on HTTP; it was recently switched to HTTPS (SSL). I am not sure where the actual issue is.

Thank you

maverick

Re: Sitemap - Is there any plugin module?
« Reply #3 on: May 26, 2018, 03:42:04 AM »
Before creating a robots.txt you must have a clear understanding of what its purpose is and what its limitations are. The slightest misuse or inclusion of incorrect directives can prevent important pages from being indexed and also harm your site's rankings. If you're unsure what you're doing, you'd be better off not having a robots.txt file at all, to ensure your site gets properly crawled and indexed.

The main thing to understand is that a robots.txt file is just a set of directives and suggestions telling crawlers which pages you prefer they not crawl; it is not intended as a means of securing your site. The terms "allow" and "disallow" in a robots.txt file are purely advisory and rely on the compliance of the crawler. Some bots don't even bother honoring robots.txt files and will crawl your entire site regardless of whether you have one or not.

Here is a good beginner's guide to understanding the robots.txt file - https://varvy.com/robottxt.html

Disallowing the crawling of JS or CSS files in your robots.txt file is one example of something that can actually harm how well search engines such as Google render and index your content, and it can result in less than optimal rankings. Google must have access to these resources in order to fully understand your pages. There's no point in disallowing or hiding JS or CSS files anyway, as anyone can see them in a modern browser by viewing the page source or by using the browser's "inspect element" feature.
https://varvy.com/spiderview.html

Disallowing the crawling of the "skins_site" folder could also cause issues, as this folder contains much of the HTML content of your pages. Even disallowing images that are essential parts of your site, such as your logo and banners, can harm your page rankings.

You can disallow access to your admin folder if you want, but since it already requires a password to log in, doing so is rather redundant; Google generally won't index it anyway. Note - a robots.txt file will NOT help secure your site or its files. It's better to use other blocking methods, such as password-protecting the admin folder on your server.

You should always test your robots.txt file with one or both of the following resources and tools, to ensure it follows Google's guidelines and is not blocking any important pages:
https://support.google.com/webmasters/answer/6062598?hl=en
https://varvy.com
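
For a quick local check before using those tools, here is a rough Python sketch run against the robots.txt quoted earlier in this thread. It is a simplification, not Google's actual parser: it assumes a single "User-agent: *" group with no wildcards, and it applies the precedence rule Google documents (the longest matching path wins, and Allow wins a tie). The test paths are hypothetical examples, not confirmed Etano file names. To my understanding, Python's built-in urllib.robotparser applies rules in file order instead, so with "Allow: /" listed first it would report everything as allowed, which is why the matching is written by hand here.

```python
ROBOTS_TXT = """\
User-Agent: *
Crawl-delay: 10
Allow: /
Disallow: /admin
Disallow: /ajax
Disallow: /events
Disallow: /fckeditor
Disallow: /images
Disallow: /includes
Disallow: /js
Disallow: /media
Disallow: /processors
Disallow: /skins_site
Disallow: /tmp
Disallow: /tools
"""


def parse_rules(robots_txt):
    """Collect (directive, path) pairs from the Allow/Disallow lines."""
    rules = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field in ("allow", "disallow") and value:
            rules.append((field, value))
    return rules


def is_allowed(rules, path):
    """Longest matching rule wins; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for directive, prefix in rules:
        if path.startswith(prefix):
            n = len(prefix)
            if n > best_len or (n == best_len and directive == "allow"):
                best_len, allowed = n, directive == "allow"
    return allowed


rules = parse_rules(ROBOTS_TXT)
# Hypothetical paths, for illustration only.
for path in ["/profile.php", "/js/site.js", "/skins_site/style.css", "/images/logo.png"]:
    print(path, "allowed" if is_allowed(rules, path) else "BLOCKED")
```

With the rules as posted, the script, skin and image paths come back blocked while the top-level page is allowed, which matches the problem described above: the "Allow: /" line does not override the more specific Disallow lines.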

All I can provide are some basic suggestions. I can't guide people on exactly what they should or should not include in their robots.txt files or sitemaps. These are personal choices, typically based on how your site's access levels are configured, and are something you need to research and try to figure out on your own, or else seek out an SEO expert to assist you.
