Python pandas: Processing CSV Data and Consolidating One Column Across Files

Administrator
2024-05-13

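Use pandas to process CSV data: extract the same column from every file and combine the results.

For example, suppose you have many .csv files (say 100), all with the same format: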

| host | ip | title | domain | region | link | product_category | lastupdatetime |
| --- | --- | --- | --- | --- | --- | --- | --- |
| www.qiushipharma.com | 47.96.115.191 | 二氢青蒿酸_香叶木素原料_磷酸替米考星现货供应_南京秋石医药科技有限公司 | qiushipharma.com | Zhejiang | http://www.qiushipharma.com | 服务,脚本语言,中间件,脚本语言 | ###### |
| www.iypxedu.com | 38.55.38.128 | 天阿萨大大撒旦 | iypxedu.com | California | http://www.iypxedu.com | 其他企业应用,服务,中间件,脚本语言 | ###### |
| www.nmzhjtwx.com | 183.61.241.31 | 中寰交通网校 | nmzhjtwx.com | Guangdong | http://www.nmzhjtwx.com | 服务,中间件,脚本语言,开发框架,脚本语言 | ###### |
| https://www.sgyxbaby.com | 67.201.3.195 | emc全站网页版 - emc全站网页下载 | sgyxbaby.com | Arizona | https://www.sgyxbaby.com | 服务,脚本语言,中间件 | ###### |
| https://www.ksrzzy.com | 23.81.4.196 | bat365在线平台登录网址 - bat365在线平台网站 | ksrzzy.com | Washington | https://www.ksrzzy.com | 服务,中间件,脚本语言 | ###### |
| ozvys.flemingtonhouses.com | 38.63.251.248 | 香港老凤祥黄金首饰价格走势图 | flemingtonhouses.com | California | http://ozvys.flemingtonhouses.com | 服务,中间件,脚本语言 | ###### |
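The whole approach rests on every file sharing the same header, so it can be worth checking that before extracting anything. The following is a minimal sketch, not part of the original script: the helper name `group_files_by_header` is hypothetical, and it assumes the files are UTF-8 encoded and comma-separated like the sample above.

```python
import glob
import os

import pandas as pd


def group_files_by_header(directory):
    """Map each distinct header (tuple of column names) to the files that use it."""
    groups = {}
    for filepath in sorted(glob.glob(os.path.join(directory, "*.csv"))):
        # nrows=0 parses only the header row, so large files stay cheap to check
        columns = tuple(pd.read_csv(filepath, nrows=0, encoding="utf-8").columns)
        groups.setdefault(columns, []).append(os.path.basename(filepath))
    return groups


if __name__ == "__main__":
    directory = "/root/桌面/8W/"  # directory used in this article; replace with your own
    groups = group_files_by_header(directory)
    if len(groups) > 1:
        for columns, files in groups.items():
            print(f"{len(files)} file(s) with header {list(columns)}")
    else:
        print("All files share one header (or no CSV files were found)")
```

If more than one group is reported, the files in the smaller groups are the ones to inspect by hand.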

Now you want to gather one kind of data in a single place: extract the link column from each of the 100 files and combine the values.

link

http://www.qiushipharma.com

http://www.iypxedu.com

http://www.nmzhjtwx.com

https://www.sgyxbaby.com

https://www.ksrzzy.com

http://ozvys.flemingtonhouses.com

The code is as follows:

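import os

import pandas as pd

# Read every CSV file in the directory and write its links to links.txt
directory = '/root/桌面/8W/'  # replace with your actual directory path
with open('links.txt', 'w', encoding='utf-8') as f:
    for filename in os.listdir(directory):
        if not filename.endswith('.csv'):
            continue
        filepath = os.path.join(directory, filename)
        print(f"Reading file: {filepath}")
        df = pd.read_csv(filepath, sep=',', encoding='utf-8')  # comma as the separator
        if 'link' not in df.columns:  # skip files that have no link column
            print(f"No link column in {filename}, skipped")
            continue
        # Drop empty cells first: NaN is a float and would break the string concatenation
        for link in df['link'].dropna().astype(str):
            f.write(link + '\n')  # one link per line
        print(f"Links extracted from {filename} and written to links.txt")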
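As an alternative to looping over rows, the same result can be produced by loading only the link column from each file and concatenating the pieces. This is a sketch under the same assumptions as the script above (UTF-8, comma-separated files), not the author's method; the function name `collect_column` is hypothetical.

```python
import glob
import os

import pandas as pd


def collect_column(directory, column="link"):
    """Return one Series holding `column` from every CSV file in `directory`."""
    parts = []
    for filepath in sorted(glob.glob(os.path.join(directory, "*.csv"))):
        # Read only the header first so files without the column can be skipped
        header = pd.read_csv(filepath, nrows=0, encoding="utf-8").columns
        if column not in header:
            continue
        # usecols limits parsing to the one column we need
        df = pd.read_csv(filepath, usecols=[column], encoding="utf-8")
        parts.append(df[column].dropna().astype(str))
    if not parts:
        return pd.Series([], dtype=str, name=column)
    return pd.concat(parts, ignore_index=True)


if __name__ == "__main__":
    links = collect_column("/root/桌面/8W/")  # directory used in this article
    with open("links.txt", "w", encoding="utf-8") as f:
        f.writelines(link + "\n" for link in links)
    print(f"{len(links)} links written to links.txt")
```

Passing `usecols` keeps memory use low when the files carry many wide columns such as title, since only the requested column is parsed.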

Run it:

└─# python3 1.py
Reading file: /root/桌面/8W/c3e5614998_202405132137资产数据.csv
Links extracted from c3e5614998_202405132137资产数据.csv and written to links.txt
Reading file: /root/桌面/8W/948089af51_202405132226资产数据.csv
Links extracted from 948089af51_202405132226资产数据.csv and written to links.txt
Reading file: /root/桌面/8W/7339e72143_202405132140资产数据.csv
Links extracted from 7339e72143_202405132140资产数据.csv and written to links.txt
Reading file: /root/桌面/8W/2bbab7ac2a_202405132131资产数据.csv
Links extracted from 2bbab7ac2a_202405132131资产数据.csv and written to links.txt
Reading file: /root/桌面/8W/94f3b4aab7_202405132138资产数据.csv
......

When processing finishes, open links.txt to see the result.
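With around 100 input files, a quick count can be easier than reading the output by eye. The sketch below uses only the standard library and assumes links.txt holds one link per line, as the script writes it; the helper name `summarize_links` is hypothetical.

```python
import os
from collections import Counter


def summarize_links(path="links.txt"):
    """Return (total links, distinct links, links that appear more than once)."""
    with open(path, encoding="utf-8") as f:
        links = [line.strip() for line in f if line.strip()]
    counts = Counter(links)
    repeated = {link: n for link, n in counts.items() if n > 1}
    return len(links), len(counts), repeated


if __name__ == "__main__":
    if os.path.exists("links.txt"):
        total, distinct, repeated = summarize_links()
        print(f"{total} links, {distinct} distinct, {len(repeated)} repeated")
```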
