[CSharp] Scrape information from a website and insert it into a database
Author: 酷&酷 / Published 2011/1/3 / 508
using System;
using System.Data;
using System.IO;
using System.Net;
using System.Text;
using System.Windows.Forms;

// The designer file for Form1 is assumed to define button1 and dataGridView1.
public partial class Form1 : Form
{
    // Shared table holding the scraped rows; its columns are created in Form1_Load.
    static DataTable dt = new DataTable();

    private void Form1_Load(object sender, EventArgs e)
    {
        // dt is static, so only add the columns once.
        if (dt.Columns.Count > 0) return;

        // "voide" holds the song title and "name" holds the singer.
        dt.Columns.Add("ID");
        dt.Columns.Add("voide");
        dt.Columns.Add("name");
    }

    private void button1_Click(object sender, EventArgs e)
    {
        dt.Rows.Clear();

        // URL to scrape
        string Url = "http://list.mp3.baidu.com/topso/mp3topsong.html?id=1#top2";

        // Download the page source for that URL
        string strWebContent = GetWebContent(Url);

        // Cut out the table that follows the "歌曲TOP500" (Top 500 songs) heading.
        // Each lookup is skipped once an earlier one has failed (-1).
        int iBodyStart = strWebContent.IndexOf("<body", 0);
        int iStart = iBodyStart < 0 ? -1 : strWebContent.IndexOf("歌曲TOP500", iBodyStart);
        int iTableStart = iStart < 0 ? -1 : strWebContent.IndexOf("<table", iStart);
        int iTableEnd = iTableStart < 0 ? -1 : strWebContent.IndexOf("</table>", iTableStart);
        if (iTableEnd < 0)
        {
            // The download failed or the page layout changed.
            MessageBox.Show("Could not find the song table in the page.");
            return;
        }
        string strWeb = strWebContent.Substring(iTableStart, iTableEnd - iTableStart + "</table>".Length);

        // Load the fragment into an HtmlDocument so it can be walked as a DOM
        WebBrowser webb = new WebBrowser();
        webb.Navigate("about:blank");
        HtmlDocument htmldoc = webb.Document.OpenNew(true);
        htmldoc.Write(strWeb);
        HtmlElementCollection htmlTR = htmldoc.GetElementsByTagName("TR");

        foreach (HtmlElement tr in htmlTR)
        {
            // Each row holds two songs: cells 0/1 and cells 2/3 are (rank, "title(singer)").
            HtmlElementCollection cells = tr.GetElementsByTagName("TD");
            if (cells.Count < 4) continue; // skip header or malformed rows

            AddSongRow(cells[0].InnerText, cells[1].InnerText);
            AddSongRow(cells[2].InnerText, cells[3].InnerText);
        }

        // Insert into the database (InsertData is not included in this snippet)
        // InsertData(dt);

        dataGridView1.DataSource = dt.DefaultView;
    }

    // Parses one rank cell and one "title(singer)" cell and appends the result to dt.
    private static void AddSongRow(string rawId, string rawTitle)
    {
        string[] info = (rawTitle ?? "").Split('(');
        string strName = info[0];
        // "未知" means "unknown"; it is used when the cell is not exactly "title(singer)".
        string strSinger = info.Length == 2 ? info[1].Replace(")", "") : "未知";

        DataRow row = dt.NewRow();
        row["ID"] = (rawId ?? "").Replace(".", "");
        row["voide"] = strName;
        row["name"] = strSinger;
        dt.Rows.Add(row);
    }

    private string GetWebContent(string Url)
    {
        string strResult = "";
        try
        {
            // Create the request
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(Url);
            request.Timeout = 30000; // timeout in milliseconds
            request.Headers.Set("Pragma", "no-cache");

            // The page is served as GB2312, so decode the response stream with that encoding.
            // The using blocks close the reader, stream and response even if reading fails.
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            using (Stream streamReceive = response.GetResponseStream())
            using (StreamReader streamReader = new StreamReader(streamReceive, Encoding.GetEncoding("GB2312")))
            {
                strResult = streamReader.ReadToEnd();
            }
        }
        catch (Exception ex)
        {
            MessageBox.Show("Error: " + ex.Message);
        }
        return strResult;
    }
}
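The snippet leaves the `InsertData(dt)` call commented out and never shows that method, even though the title promises a database insert. Below is a minimal sketch of what such a method might look like, assuming SQL Server and ADO.NET's `System.Data.SqlClient`. The class name `SongStore`, the connection string and the `TopSongs` table are hypothetical and not from the original post; only the column names `ID`, `voide` and `name` come from the `DataTable` above.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

static class SongStore
{
    // Hypothetical values: replace with your own server, database and table.
    const string ConnectionString = "Server=.;Database=Music;Integrated Security=true";
    const string InsertSql =
        "INSERT INTO TopSongs (ID, voide, name) VALUES (@id, @voide, @name)";

    // Inserts every row of dt in a single transaction and returns the number of rows written.
    public static int InsertData(DataTable dt)
    {
        // Nothing to write, so do not open a connection at all.
        if (dt == null || dt.Rows.Count == 0) return 0;

        int inserted = 0;
        using (SqlConnection conn = new SqlConnection(ConnectionString))
        {
            conn.Open();
            using (SqlTransaction tx = conn.BeginTransaction())
            using (SqlCommand cmd = new SqlCommand(InsertSql, conn, tx))
            {
                // Parameters keep scraped text out of the SQL string itself.
                cmd.Parameters.Add("@id", SqlDbType.NVarChar, 20);
                cmd.Parameters.Add("@voide", SqlDbType.NVarChar, 200);
                cmd.Parameters.Add("@name", SqlDbType.NVarChar, 200);

                foreach (DataRow row in dt.Rows)
                {
                    cmd.Parameters["@id"].Value = row["ID"];
                    cmd.Parameters["@voide"].Value = row["voide"];
                    cmd.Parameters["@name"].Value = row["name"];
                    inserted += cmd.ExecuteNonQuery();
                }

                // If anything above throws, disposing tx rolls the batch back.
                tx.Commit();
            }
        }
        return inserted;
    }
}
```

With this in place, the commented-out line in `button1_Click` could become `SongStore.InsertData(dt);`. A parameterized command is used here because the song titles come from an external page and may contain quotes; for much larger tables, `SqlBulkCopy` would be a reasonable alternative.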
Copyright © 2004 - 2024 dezai.cn. All Rights Reserved