Ten File-Handling Techniques Python Experts Use to Double Your Productivity

Author: 手把手PythonAI編程 2025-07-11 01:05:41


This article introduces ten file-handling techniques that experienced Python programmers rely on.

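Introduction

Batch automation: when you handle 1,000+ files a day, scripted processing can cut manual work by as much as 80%
Cross-platform compatibility: handle Windows, Linux, and Mac file paths in a uniform way
Foundation for data preprocessing: supply structured input for data analysis and machine learning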

Intended audience: developers who know basic Python syntax (variables, loops, functions) and have some experience reading and writing files.

Technique ① Smart path joining

Use os.path.join() instead of string concatenation

Example:

import os
# Standard version
path = os.path.join("data", "2023", "logs.txt")  # On Windows: data\2023\logs.txt
# Improved version (Python 3.4+)
from pathlib import Path
path = Path("data") / "2023" / "logs.txt"  # Uses the correct separator for the current OS

Warning: building paths with + and a hard-coded separator can break when the code runs on another platform
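To see the separator handling without switching machines, the sketch below uses pathlib's "pure" path classes, which only manipulate strings and never touch the disk. The directory and file names are the same illustrative ones used above.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The "pure" path classes only manipulate strings, so both flavours
# can be built on any operating system.
parts = ("data", "2023", "logs.txt")

posix_path = PurePosixPath(*parts)
windows_path = PureWindowsPath(*parts)

print(posix_path)      # data/2023/logs.txt
print(windows_path)    # data\2023\logs.txt

# Common components are available as attributes.
print(posix_path.name)    # logs.txt
print(posix_path.suffix)  # .txt
print(posix_path.parent)  # data/2023
```

In everyday code, plain Path is usually enough, since it picks the right flavour for the current system.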

Technique ② Context managers

The with statement releases resources automatically

Example:

# Standard version
f = open("data.txt", "r")
content = f.read()
f.close()  # Never reached if read() raises, so the handle can leak

# Improved version
with open("data.txt", "r") as f:  # The file is closed automatically when the block exits
    content = f.read()

? 優(yōu)勢(shì)：防止文件句柄泄漏，建議100%使用

Technique ③ Batch file filtering

Match files with glob wildcards

Example:

import glob
# Match every CSV file in data/
csv_files = glob.glob("data/*.csv")
# Search subdirectories recursively
all_logs = glob.glob("**/*.log", recursive=True)

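Pattern syntax: * matches any run of characters within one path component, ? matches exactly one character, and [0-9] matches one character in the given range. With recursive=True, ** also matches zero or more directory levels.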
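glob relies on the same shell-style wildcard rules as the standard fnmatch module, so the three wildcards can be tried out on a plain list of names without creating any files. The file names below are invented for illustration.

```python
import fnmatch

# Hypothetical file names, used only to show how each wildcard behaves.
names = ["log1.txt", "log2.txt", "log10.txt", "logA.txt", "notes.md"]

star = fnmatch.filter(names, "log*.txt")      # any run of characters
question = fnmatch.filter(names, "log?.txt")  # exactly one character
digit = fnmatch.filter(names, "log[0-9].txt") # exactly one digit

print(star)      # ['log1.txt', 'log2.txt', 'log10.txt', 'logA.txt']
print(question)  # ['log1.txt', 'log2.txt', 'logA.txt']
print(digit)     # ['log1.txt', 'log2.txt']
```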

Technique ④ Iterating over file contents

Process large files line by line

Example:

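def process(line):
    # Placeholder: replace with your own per-line handling
    print(line.rstrip("\n"))

with open("bigfile.txt", "r") as f:
    for line in f:  # Only about one line (plus a read buffer) is held in memory at a time
        process(line)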

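Performance comparison: for a 1 GB file, iterating line by line can be around 3x faster than readlines(), which must first load every line into a list; the exact gain depends on the machine and the file.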
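As a concrete instance of the pattern, here is a minimal sketch that counts lines containing a keyword while reading one line at a time. The function name count_matching_lines and the sample file are assumptions made for this example.

```python
import os
import tempfile

def count_matching_lines(path, keyword, encoding="utf-8"):
    """Count lines containing `keyword`, reading one line at a time."""
    count = 0
    with open(path, "r", encoding=encoding) as f:
        for line in f:
            if keyword in line:
                count += 1
    return count

# Demonstration on a small throwaway file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "bigfile.txt")
    with open(sample, "w", encoding="utf-8") as f:
        f.write("ok\nERROR one\nok\nERROR two\n")
    error_lines = count_matching_lines(sample, "ERROR")
    print(error_lines)  # 2
```

Because nothing is accumulated except the counter, the same function should behave the same way on a file far larger than available memory.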

Technique ⑤ Binary file handling

Use rb/wb modes for non-text files

Example:

with open("image.png", "rb") as f:  # Binary read mode
    data = f.read()
with open("copy.png", "wb") as f:  # Binary write mode
    f.write(data)

Use cases: images, video, and transferring encrypted files
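Reading a whole file with f.read() is fine for small images, but for large binaries it may be preferable to copy in fixed-size chunks. The sketch below is one way to do that; copy_in_chunks and the 64 KB chunk size are illustrative choices, not something prescribed by the article.

```python
import os
import tempfile

def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in fixed-size chunks; return the number of bytes copied."""
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:  # An empty bytes object means end of file
                break
            fout.write(chunk)
            copied += len(chunk)
    return copied

# Demonstration with generated bytes standing in for an image.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "image.bin")
    dst = os.path.join(tmp, "copy.bin")
    payload = bytes(range(256)) * 1000  # 256,000 bytes
    with open(src, "wb") as f:
        f.write(payload)

    bytes_copied = copy_in_chunks(src, dst)
    with open(dst, "rb") as f:
        copy_matches = f.read() == payload
```

For a plain copy with no processing in between, the standard library's shutil.copyfile does the same job in one call.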

Technique ⑥ Declaring the file encoding

Specify the encoding explicitly

Example:

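with open("chinese.txt", "r", encoding="utf-8") as f:  # Specify the encoding explicitly
    text = f.read()

# Handling a file whose encoding is unknown
import chardet  # Third-party package: pip install chardet
with open("unknown.txt", "rb") as f:
    result = chardet.detect(f.read())
    encoding = result['encoding']  # A best guess; may be None if detection fails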

Common encodings: utf-8, gbk, latin-1
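When installing a detection library is not an option, a simple alternative is to try a short list of likely encodings in order. This is a different technique from chardet's statistical detection, and the function name and encoding order below are assumptions for illustration.

```python
def decode_with_fallback(raw, encodings=("utf-8", "gbk")):
    """Try each encoding in order; return (text, encoding_used)."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so it never fails,
    # but the result may be meaningless for non-Latin text.
    return raw.decode("latin-1"), "latin-1"

utf8_bytes = "中文".encode("utf-8")
gbk_bytes = "中文".encode("gbk")

utf8_result = decode_with_fallback(utf8_bytes)
gbk_result = decode_with_fallback(gbk_bytes)

print(utf8_result)  # ('中文', 'utf-8')
print(gbk_result)   # ('中文', 'gbk')
```

Order matters here: strict encodings such as utf-8 should come first, because permissive ones will happily decode almost anything into the wrong characters.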

Technique ⑦ File metadata

Read file attributes

Example:

import os
stat_info = os.stat("data.txt")
print(stat_info.st_size)   # File size in bytes
print(stat_info.st_mtime)  # Last modification time (seconds since the epoch)

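Timestamp conversion: time.ctime(stat_info.st_mtime) returns a readable string. Note that st_ctime is the creation time on Windows but the time of the last metadata change on Unix.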
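For logs or reports, an ISO 8601 timestamp is often easier to work with than the time.ctime() string. The following sketch wraps os.stat() in a small helper; describe_file and the dictionary keys are names invented for this example.

```python
import os
import tempfile
from datetime import datetime, timezone

def describe_file(path):
    """Return the size and last-modified time (UTC, ISO 8601) of a file."""
    st = os.stat(path)
    modified = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
    return {"size_bytes": st.st_size, "modified_utc": modified.isoformat()}

# Demonstration on a throwaway 5-byte file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "data.txt")
    with open(sample, "wb") as f:
        f.write(b"hello")
    info = describe_file(sample)

print(info["size_bytes"])  # 5
```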

Technique ⑧ Batch renaming

Automate renaming with os.rename()

Example:

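import os
# Filter and sort first: os.listdir() order is arbitrary, and numbering only
# the .txt files avoids gaps. Assumes no log_NNN.txt files exist yet.
txt_files = sorted(name for name in os.listdir() if name.endswith(".txt"))
for i, filename in enumerate(txt_files, start=1):
    new_name = f"log_{i:03d}.txt"  # log_001.txt, log_002.txt, ...
    os.rename(filename, new_name)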

Improvement: use the re module to rename files based on regular expressions
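A regular-expression rename might look like the sketch below. The naming scheme (report_YYYY-MM-DD.txt becoming YYYYMMDD_report.txt) and the helper plan_renames are hypothetical; the point is to build the list of renames first so it can be reviewed before anything on disk changes.

```python
import os
import re
import tempfile

# Hypothetical naming scheme: report_2023-07-11.txt -> 20230711_report.txt
PATTERN = re.compile(r"^report_(\d{4})-(\d{2})-(\d{2})\.txt$")

def plan_renames(filenames):
    """Return (old, new) name pairs for every file name matching PATTERN."""
    plan = []
    for name in sorted(filenames):
        match = PATTERN.match(name)
        if match:
            year, month, day = match.groups()
            plan.append((name, f"{year}{month}{day}_report.txt"))
    return plan

# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("report_2023-07-11.txt", "report_2024-01-02.txt", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()

    rename_plan = plan_renames(os.listdir(tmp))
    for old, new in rename_plan:
        os.rename(os.path.join(tmp, old), os.path.join(tmp, new))

    renamed = sorted(os.listdir(tmp))

print(renamed)  # ['20230711_report.txt', '20240102_report.txt', 'notes.txt']
```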

Technique ⑨ Compressing files

Use the zipfile module

Example:

import zipfile
# Create an archive
with zipfile.ZipFile("archive.zip", "w") as zipf:
    zipf.write("data.txt")

# Extract an archive
with zipfile.ZipFile("archive.zip", "r") as zipf:
    zipf.extractall("extracted")

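Password-protected archives: pass pwd=b"password" to extractall() or read() when extracting. The zipfile module can decrypt encrypted archives but cannot create them.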
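One detail worth knowing is that ZipFile stores files uncompressed by default (ZIP_STORED). The round-trip sketch below, using generated data in a temporary directory, passes ZIP_DEFLATED to actually compress and then inspects the result.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "data.txt")
    with open(source, "wb") as f:
        f.write(b"hello\n" * 1000)  # 6,000 highly repetitive bytes

    archive = os.path.join(tmp, "archive.zip")
    # ZIP_DEFLATED compresses; the default ZIP_STORED only stores.
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # arcname keeps the temporary directory out of the stored path.
        zf.write(source, arcname="data.txt")

    with zipfile.ZipFile(archive, "r") as zf:
        names = zf.namelist()
        info = zf.getinfo("data.txt")
        restored = zf.read("data.txt")

    original_size = info.file_size
    compressed_smaller = info.compress_size < info.file_size

print(names)          # ['data.txt']
print(original_size)  # 6000
```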

Technique ⑩ Memory-mapped files

Read very large files

Example:

import mmap
with open("huge.bin", "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)  # Map the whole file into memory
    print(mm.find(b"pattern"))     # Fast search; prints the byte offset, or -1 if not found
    mm.close()

Use case: binary files of 500 MB or more
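If the file only needs to be searched, not modified, a read-only mapping is a slightly safer variant. The sketch below uses mmap as a context manager so the mapping is released automatically; find_offset and the generated sample file are assumptions for the demonstration.

```python
import mmap
import os
import tempfile

def find_offset(path, pattern):
    """Return the byte offset of the first match, or -1 if absent."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps it read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm.find(pattern)

# Demonstration on a small generated file standing in for a huge one.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "huge.bin")
    with open(sample, "wb") as f:
        f.write(b"\x00" * 4096 + b"pattern" + b"\x00" * 4096)

    offset = find_offset(sample, b"pattern")
    missing = find_offset(sample, b"absent")

print(offset)   # 4096
print(missing)  # -1
```

Note that mapping an empty file raises an error, so real code may want to check the file size first.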

實(shí)戰(zhàn)案例：日志文件分析

import glob
from collections import defaultdict

def analyze_logs(log_dir):
    """Count ERROR and WARNING lines across every .log file in log_dir."""
    stats = defaultdict(int)
    for log_file in glob.glob(f"{log_dir}/*.log"):
        with open(log_file, "r") as f:
            for line in f:
                if "ERROR" in line:
                    stats["errors"] += 1
                elif "WARNING" in line:
                    stats["warnings"] += 1
    return dict(stats)

print(analyze_logs("server_logs"))
# Example output: {'errors': 127, 'warnings': 45}

Key points: line-by-line processing + running counters + batch file traversal

Editor: 趙寧寧 | Source: 手把手PythonAI編程
