Creating corresponding subfolders and writing a portion of each file to new files inside those subfolders using Python

I have a folder named "data". It contains the subfolders "data_1", "data_2", and "data_3", each of which holds some text files. I want to walk through all of these subfolders and create subfolders with the same names inside another folder named "processed_data". For each original file, I also want to create a corresponding file whose name has "processed" as a prefix, and write to it every line of the original file that contains "1293".
I am using the code below, but I am not getting the required result: neither the subfolders "data_1", "data_2", and "data_3" nor the files are being created.

    import os

    folder_name=""

    def pre_processor():
        data_location="D:\\data" # folder containing all the data
        for root, dirs, files in os.walk(data_location):
            for dir in dirs:
                #folder_name=""
                folder_name=dir
            for filename in files:
                with open(os.path.join(root, filename),encoding="utf8",mode="r") as f:
                    processed_file_name = 'D:\\processed_data\\'+folder_name+'\\'+'processed'+filename
                    processed_file = open(processed_file_name,"w", encoding="utf8")
                    for line_number, line in enumerate(f, 1):
                        if "1293" in line:
                            processed_file.write(str(line))
                    processed_file.close()

    pre_processor()

2 Answers

You might need to elaborate on the issue you are having; e.g., are the files being created, but empty?

A few things I notice:
1) Your indentation is off (this may just be a copy-paste issue): the body of pre_processor sits at the same level as the def line instead of being indented under it, so the function itself is empty.
Try this:

    import os

    folder_name=""

    def pre_processor():
        data_location="D:\\data" # folder containing all the data
        for root, dirs, files in os.walk(data_location):
            for dir in dirs:
                #folder_name=""
                folder_name=dir
            for filename in files:
                with open(os.path.join(root, filename), encoding="utf8",mode="r") as f:
                    processed_file_name = 'D:\\processed_data\\'+folder_name+'\\'+'processed'+filename
                    processed_file = open(processed_file_name,"w", encoding="utf8")
                    for line_number, line in enumerate(f, 1):
                        if "1293" in line:
                            processed_file.write(str(line))
                    processed_file.close()

    pre_processor()

2) Check whether processed_data and its subfolders exist; if they do not, create them first, because opening a file in write mode does not create missing directories.
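As a minimal sketch of point 2, the helper below creates the output subfolder before any file is opened for writing. The function name ensure_output_dir and its parameters are hypothetical, chosen only for illustration; os.makedirs with exist_ok=True is the standard-library call doing the work.

```python
import os

def ensure_output_dir(output_root, subfolder_name):
    """Create output_root/subfolder_name if it is missing and return its path.

    Both parameter names are illustrative assumptions, not part of the
    original question.
    """
    target = os.path.join(output_root, subfolder_name)
    # exist_ok=True makes this a no-op when the directory already exists,
    # and makedirs also creates any missing parent directories.
    os.makedirs(target, exist_ok=True)
    return target
```

Calling something like ensure_output_dir("processed_data", folder_name) just before open(processed_file_name, "w", ...) should avoid the FileNotFoundError that a missing folder would otherwise raise.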

Yes, it was a copy-paste issue. I have corrected the question. Thanks!

– Slickmind
Sep 17 '18 at 14:47

Instead of building the path to the new folder by hand, you could just replace the name of the top-level folder in the path that os.walk gives you.
Furthermore, you are not creating the subfolders.

This code should work, but it uses Linux-style forward slashes, so replace them with the separator for your platform:

    import os

    folder_name=""

    def pre_processor():
        data_location="data" # folder containing all the data
        for root, dirs, files in os.walk(data_location):
            for dir in dirs:
                # folder_name=""
                folder_name = dir
            for filename in files:
                joined_path = os.path.join(root, filename)
                with open(joined_path, encoding="utf8", mode="r") as f:
                    processed_folder_name = root.replace("data/", 'processed_data/')
                    processed_file_name = processed_folder_name+'/processed'+filename
                    if not os.path.exists(processed_folder_name):
                        os.makedirs(processed_folder_name)
                    processed_file = open(processed_file_name, "w", encoding="utf8")
                    for line in f:
                        if "1293" in line:
                            processed_file.write(str(line))
                    processed_file.close()

    pre_processor()
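If you would rather not edit the slashes per platform, one possible alternative is to let pathlib build the paths. This is only a sketch: the function name pre_process and its parameters (data_root, output_root, needle, prefix) are assumptions made for illustration, and it assumes the output folder is not located inside the data folder.

```python
from pathlib import Path

def pre_process(data_root, output_root, needle="1293", prefix="processed"):
    """Mirror data_root under output_root, keeping only lines containing needle.

    All parameter names are illustrative; output_root is assumed to lie
    outside data_root so that written files are not picked up again.
    """
    data_root = Path(data_root)
    output_root = Path(output_root)
    written = []
    for source in data_root.rglob("*"):
        if not source.is_file():
            continue
        # Path of the file relative to data_root, e.g. data_1/a.txt
        relative = source.relative_to(data_root)
        target = output_root / relative.parent / (prefix + source.name)
        # Create the mirrored subfolder (and any parents) if it is missing
        target.parent.mkdir(parents=True, exist_ok=True)
        with source.open(encoding="utf8") as src, \
                target.open("w", encoding="utf8") as dst:
            for line in src:
                if needle in line:
                    dst.write(line)
        written.append(target)
    return written
```

A call such as pre_process("data", "processed_data") would then be the equivalent of pre_processor() above. Using relative_to instead of a string replace also avoids accidentally rewriting a subfolder whose own name happens to contain "data/".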
