Python, memory error, csv file too large
I have a problem with a Python module that cannot handle importing a large data file (targets.csv is nearly 1 GB).
The error occurs when this line is executed:
    targets = [(name, float(X), float(Y), float(Z), float(BG))
               for name, X, Y, Z, BG in csv.reader(open('targets.csv'))]
Traceback:

    Traceback (most recent call last):
      File "C:\Users\gary\Documents\EPSON STUDIES\colors_text_D65.py", line 41, in <module>
        for name, X, Y, Z, BG in csv.reader(open('targets.csv'))]
    MemoryError
Is there a way to read targets.csv line by line? And would doing so slow the process down?
This module is already quite slow...
Thanks!
    import geometry
    import csv
    import numpy as np
    import random
    import cv2

    S = 0

    img = cv2.imread("MAP.tif", -1)
    height, width = img.shape

    pixx = height * width
    iterr = float(pixx / 1000)
    accomplished = 0
    temp = 0

    ppm = file("epson gamut.ppm", 'w')

    # PPM file header (width and height must be separated by whitespace)
    ppm.write("P3" + " " + str(width) + " " + str(height) + " " + "255" + " ")

    all_colors = [(name, float(X), float(Y), float(Z))
                  for name, X, Y, Z in csv.reader(open('XYZcolorlist_D65.csv'))]

    # background is marked SUPPORT
    support_i = [i for i, color in enumerate(all_colors) if color[0] == '255 255 255']
    if len(support_i) > 0:
        support = np.array(all_colors[support_i[0]][1:])
        del all_colors[support_i[0]]
    else:
        support = None

    tg, hull_i = geometry.tetgen_of_hull([(X, Y, Z) for name, X, Y, Z in all_colors])
    colors = [all_colors[i] for i in hull_i]

    print ("thrown out:" + ",".join(set(zip(*all_colors)[0]).difference(zip(*colors)[0])))

    targets = [(name, float(X), float(Y), float(Z), float(BG))
               for name, X, Y, Z, BG in csv.reader(open('targets.csv'))]

    for target in targets:
        name, X, Y, Z, BG = target
        target_point = support + (np.array([X, Y, Z]) - support) / (1 - BG)
        tet_i, bcoords = geometry.containing_tet(tg, target_point)

        if tet_i is None:
            # print str("out")
            ppm.write(str("255 255 255") + " ")
            print "out"

            temp += 1
            if temp >= iterr:
                accomplished += temp
                print str(100 * accomplished / (float(pixx))) + str(" %")
                temp = 0

            continue  # not in gamut
        else:
            A = bcoords[0]
            B = bcoords[1]
            C = bcoords[2]
            D = bcoords[3]

            R = random.uniform(0, 1)
            names = [colors[i][0] for i in tg.tets[tet_i]]

            if R <= A:
                S = names[0]
            elif R <= A + B:
                S = names[1]
            elif R <= A + B + C:
                S = names[2]
            else:
                S = names[3]

            ppm.write(str(S) + " ")

            temp += 1
            if temp >= iterr:
                accomplished += temp
                print str(100 * accomplished / (float(pixx))) + str(" %")
                temp = 0

    print "done"
    ppm.close()
    targets = ((name, float(X), float(Y), float(Z), float(BG))
               for name, X, Y, Z, BG in csv.reader(open('targets.csv')))
(Switching from square brackets to parentheses should
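The same idea can also be written as a generator function, which may be easier to read than a long generator expression. This is only a minimal sketch, assuming every row has exactly five columns as in the question; `iter_targets` and the sample rows are hypothetical names used for illustration and are not part of the original module.

```python
import csv


def iter_targets(lines):
    """Yield (name, X, Y, Z, BG) tuples one row at a time.

    `lines` is any iterable of CSV text lines, for example an open
    file object. Nothing is stored beyond the row currently in use.
    """
    for name, X, Y, Z, BG in csv.reader(lines):
        yield name, float(X), float(Y), float(Z), float(BG)


# Hypothetical in-memory sample standing in for targets.csv.
sample = ["a,1,2,3,0.5", "b,4,5,6,0.25"]

rows = iter_targets(sample)
first = next(rows)      # parses only the first row
remaining = list(rows)  # parses whatever is left
```

With the real file, the call would be `iter_targets(open('targets.csv'))`, and the existing `for target in targets:` loop can consume the result unchanged. Each row is still parsed exactly once, so the per-row cost should be about the same as with the list. A generator can only be iterated a single time, though, so anything that needs a second pass over the data would have to reopen the file.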