How to Build and Submit a Sitemap for a Dynamic Website

Compiled and published by: 黑客风云    Source: www.05112.com    Updated: 2007-1-16 8:45:01
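Submitting a sitemap to Google is an important way to make a website friendlier to search engines. A sitemap.xml file helps the spider crawl the site better and faster. For a static website, the sitemap generator provided by Google produces a perfectly good sitemap.xml with almost no effort. For a dynamic website, building the sitemap takes more thought.
 
  Webmasters who have tried it know that if you run Google's generator against a dynamic website using the first kind of setting, the resulting sitemap.xml is quite small. With that setting the generator records only URLs that correspond to actual files under the site's root directory and mostly ignores dynamically generated URLs. Yet the most important content of a dynamic website lives behind exactly those generated URLs. This mismatch leads to the following rule:
   If you are going to submit a sitemap, submit a sitemap.xml that contains every important URL. (If you cannot build one, it is better not to submit at all: one of guagua's dynamic websites had more than 30,000 pages indexed before any sitemap was submitted; after a sitemap made with the generator was submitted, the indexed count dropped to 9,000.)
 
  The way to build a sitemap for a dynamic website is: first write a program that exports all of the dynamic URLs into a txt file, then use the generator's second kind of setting (urllist) to produce sitemap.xml.
Here is a worked example:
1. Export all of the URLs with a program
 Most important of all: each line may contain only one URL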
getsitemap.php:
<?php
// Site configuration, database connection, and helper functions.
// param_bncode() and param_encode() are the site's own helpers from function.php.
include("./include/config.php");
include("./include/con_db.php");
include("./include/function.php");
// Note: the mysql_* functions used here were removed in PHP 7.0;
// on newer PHP versions the queries have to be ported to mysqli or PDO.
ob_start();
$base="http://www.websitename.com.au/"; // every URL must use the same host as base_url
echo $base."\n"; // first URL: the home page
$ids=array(1,2,3,4,6,7,8,10,11,12,13,15,19,20); // product category ids
foreach($ids as $categories_id)
{
 $sql="select distinct categories_id,categories_name from categories where categories_id='$categories_id'";
 $result=mysql_query($sql);
 while($row=mysql_fetch_array($result))
 {
  $cid=(int)$row["categories_id"];
  $cname=param_bncode($row["categories_name"]); // encode once, reuse at every level
  echo $base."categories.php/".$cname."\n"; // category page: lists all brands in the category (level-2 URL)
  $sql2="select distinct models_brand from models m,models_to_products m2p,products p where m.models_id=m2p.models_id and m2p.products_id=p.products_id and m.categories_id=$cid";
  $result2=mysql_query($sql2);
  while($row1=mysql_fetch_array($result2))
  {
   $brand_sql=mysql_real_escape_string($row1["models_brand"]); // escaped copy for the query
   $brand=param_encode($row1["models_brand"]); // encoded copy for the URL
   echo $base."brands.php/".$cname."/".$brand."\n"; // brand page: lists all models of the brand (level-3 URL)
   $sql3="select distinct models_name from models m,models_to_products mp,products p where m.models_id=mp.models_id and p.products_id=mp.products_id and m.models_brand='$brand_sql' and m.categories_id=$cid";
   $result3=mysql_query($sql3);
   while($row2=mysql_fetch_array($result3))
   {
    $model=param_encode($row2["models_name"]);
    echo $base."models.php/".$cname."/".$brand."/".$model."\n"; // product page for this category, brand, and model (level-4 URL)
   }
  }
 }
}
$str=ob_get_contents();
$file=fopen("urllist.txt","w"); // write everything to a file named urllist.txt
fwrite($file,$str);
fclose($file);
?>
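Because the one-URL-per-line rule is so important, it can be worth checking urllist.txt before handing it to the generator. The following is only a minimal sketch in Python (the language sitemap_gen.py itself is written in); the function name `check_urllist` and its checks are illustrative assumptions, not part of the generator. It flags lines that hold more than one URL, lines that are not absolute http(s) URLs, and URLs whose host differs from the `base_url` in the config, and it drops duplicates.

```python
from urllib.parse import urlsplit


def check_urllist(lines, base_url):
    """Return (good_urls, problems) for the lines of a urllist file.

    Hypothetical helper: `problems` is a list of (line_number, reason).
    """
    base_host = urlsplit(base_url).netloc
    good, problems, seen = [], [], set()
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines carry no URL
        tokens = line.split()
        # More than one token that looks like a URL breaks the one-per-line rule.
        if sum("://" in token for token in tokens) > 1:
            problems.append((number, "more than one URL on the line"))
            continue
        url = tokens[0]
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append((number, "not an absolute http(s) URL"))
            continue
        if parts.netloc != base_host:
            problems.append((number, "host differs from base_url"))
            continue
        if url not in seen:  # keep the first copy, skip duplicates
            seen.add(url)
            good.append(url)
    return good, problems
```

A possible use is `good, problems = check_urllist(open("urllist.txt", encoding="utf-8"), "http://www.websitename.com.au")`, then fixing the export script until `problems` is empty.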
2. Set up your My_config.xml file
<?xml version="1.0" encoding="UTF-8"?>
<!--
  sitemap_gen.py example configuration script
  This file specifies a set of sample input parameters for the
  sitemap_gen.py client.
  You should copy this file into "config.xml" and modify it for
  your server.

  ********************************************************* -->

<!-- ** MODIFY **
  The "site" node describes your basic web site.
  Required attributes:
    base_url   - the top-level URL of the site being mapped
    store_into - the webserver path to the desired output file.
                 This should end in '.xml' or '.xml.gz'
                 (the script will create this file)
  Optional attributes:
    verbose    - an integer from 0 (quiet) to 3 (noisy) for
                 how much diagnostic output the script gives
    suppress_search_engine_notify="1"
               - disables notifying search engines about the new map
                 (same as the "testing" command-line argument.)
    default_encoding
               - names a character encoding to use for URLs and
                 file paths.  (Example: "UTF-8")
-->
<site
  base_url="http://www.websitename.com.au"
  store_into="/var/website/websitename/sitemap.xml"
  verbose="1"
  >
 
 <url
    href="http://www.websitename.com.au"
    lastmod="2007-01-01"
    changefreq="daily"
    priority="1.0" />
  <directory  path="/var/website/websitename/"  url="http://www.websitename.com.au/" />
  <urllist  path="/var/website/websitename/urllist.txt"  encoding="UTF-8" />
  <filter  action="drop"  type="wildcard"  pattern="*.css"     />
  <filter  action="drop"  type="wildcard"  pattern="*.js"      />
  <filter  action="drop"  type="wildcard"  pattern="*.inc"     />
  <filter  action="drop"  type="wildcard"  pattern="*.jpg"     />
  <filter  action="drop"  type="wildcard"  pattern="*/search"  />
  <filter  action="drop"  type="wildcard"  pattern="*/images"  />
  <!-- ** MODIFY or DELETE **
    "sitemap" nodes tell the script to scan other Sitemap files.  This can
    be useful to aggregate the results of multiple runs of this script into
    a single Sitemap.
    Required attributes:
      path       - path to the file
  <sitemap    path="/sitemap.xml" />
  -->
  <!-- ********************************************************
          FILTERS
  Filters specify wild-card patterns that the script compares
  against all URLs it finds.  Filters can be used to exclude
  certain URLs from your Sitemap, for instance if you have
  hidden content that you hope the search engines don't find.
  Filters can be either type="wildcard", which means standard
  path wildcards (* and ?) are used to compare against URLs,
  or type="regexp", which means regular expressions are used
  to compare.
  Filters are applied in the order specified in this file.
  An action="drop" filter causes exclusion of matching URLs.
  An action="pass" filter causes inclusion of matching URLs,
  shortcutting any other later filters that might also match.
  If no filter at all matches a URL, the URL will be included.
  Together you can build up fairly complex rules.
  The default action is "drop".
  The default type is "wildcard".
  You can MODIFY or DELETE these entries as appropriate for
  your site.  However, unlike above, the example entries in
  this section are not contrived and may be useful to you as
  they are.
  ********************************************************* -->
  <!-- Exclude URLs that end with a '~'   (IE: emacs backup files)      -->
  <filter  action="drop"  type="wildcard"  pattern="*~"           />
  <!-- Exclude URLs within UNIX-style hidden files or directories       -->
  <filter  action="drop"  type="regexp"    pattern="/\.[^/]*"     />
</site>
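The filter rules described in the config comments (applied in order, "drop" excludes, "pass" includes and stops further checks, no match means the URL is kept) can be illustrated with a short Python sketch. This is not the generator's own code: `fnmatch.fnmatchcase` and `re.search` are stand-ins chosen here as assumptions for how wildcard and regexp patterns are compared, and `keep_url` is a hypothetical name.

```python
import fnmatch
import re

# (action, type, pattern) triples mirroring the <filter> nodes in the config.
FILTERS = [
    ("drop", "wildcard", "*.css"),
    ("drop", "wildcard", "*.js"),
    ("drop", "wildcard", "*.inc"),
    ("drop", "wildcard", "*.jpg"),
    ("drop", "wildcard", "*/search"),
    ("drop", "wildcard", "*/images"),
    ("drop", "wildcard", "*~"),
    ("drop", "regexp", r"/\.[^/]*"),
]


def matches(kind, pattern, url):
    """Compare one URL against one pattern (assumed matching semantics)."""
    if kind == "wildcard":
        return fnmatch.fnmatchcase(url, pattern)
    return re.search(pattern, url) is not None


def keep_url(url, filters=FILTERS):
    """Apply filters in order; the first matching filter decides."""
    for action, kind, pattern in filters:
        if matches(kind, pattern, url):
            return action == "pass"
    return True  # no filter matched, so the URL is included
```

Under these assumptions, `keep_url("http://www.websitename.com.au/style.css")` is rejected while the dynamic URLs from urllist.txt pass through. A "pass" filter placed before a "drop" filter lets a specific URL through, which is how more complex rules are built up.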
Then simply run the generator. It creates a brand-new sitemap.xml in the website's root directory; submit that file to Google and you are done.
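Since submitting an incomplete sitemap can do more harm than submitting none, it may help to confirm that every URL from urllist.txt actually landed in the generated file before submitting it. The sketch below is a rough check in Python under stated assumptions: the function names are hypothetical, the output is an uncompressed sitemap.xml, and URLs are compared as exact strings (if the generator normalizes or escapes URLs, some entries may be reported as missing even though they are present in another form).

```python
import xml.etree.ElementTree as ET


def sitemap_locs(xml_data):
    """Collect the text of every <loc> element, whatever the XML namespace."""
    root = ET.fromstring(xml_data)
    return {
        element.text.strip()
        for element in root.iter()
        if element.tag.rsplit("}", 1)[-1] == "loc" and element.text
    }


def missing_from_sitemap(url_lines, xml_data):
    """Return the URLs listed in the url list that the sitemap lacks."""
    locs = sitemap_locs(xml_data)
    wanted = [line.strip() for line in url_lines if line.strip()]
    return [url for url in wanted if url not in locs]
```

For example, `missing_from_sitemap(open("urllist.txt", encoding="utf-8"), open("sitemap.xml", "rb").read())` should return an empty list when all exported URLs made it into the sitemap; URLs removed by a filter will show up in the result.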
Entered by: cainiaowang    Editor: cainiaowang