Search for a word in a docx file and copy the file to the keyword folder




I am working on a task in which I have a text file containing specific keywords (C#, Angular, Python, Rest, MySQL) and have to create a folder for each one. After that, I have to search for these keywords in a set of resumes and, when a keyword is found, copy the resume into the matching folder.



For example, if A has C# and Angular skills, his or her resume will be copied into both folders.



I have completed the folder creation. I need help with searching for a word in a .docx file and copying the file to the matching folder. I have looked online but have been unable to make progress. Can anyone give me a lead on how to search for a word/string in a .docx file?



Here is my code:



Folder creation


import os

InputFile = "ResumeKeyword.txt"
fileOpen = open(InputFile)

# strip() drops the trailing newline so the last folder name stays clean
for keyword in fileOpen.readline().strip().split(','):
    print(keyword)
    os.makedirs(keyword)

fileOpen.close()



And for reading the docx


from docx import Document
document = Document('A.docx')
word = "Angular"
for x in document.paragraphs:
    print(x.text)
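
For the search step itself, a minimal sketch is shown here. It relies only on attributes that python-docx documents expose (`paragraphs`, `tables`, `rows`, `cells`, `text`); the helper names `document_text` and `contains_keyword` are hypothetical and not part of the library. Table cells are included because resumes often list skills in tables, and `document.paragraphs` alone does not cover them.

```python
def document_text(document):
    """Collect text from the body paragraphs and table cells of a python-docx Document."""
    parts = [p.text for p in document.paragraphs]
    for table in document.tables:
        for row in table.rows:
            for cell in row.cells:
                parts.append(cell.text)
    return "\n".join(parts)


def contains_keyword(document, word):
    """Case-insensitive substring check over the collected text."""
    return word.lower() in document_text(document).lower()


# Usage (assumes python-docx is installed and A.docx exists):
# from docx import Document
# print(contains_keyword(Document("A.docx"), "Angular"))
```

This is a plain substring test, so a short keyword such as "Rest" would also match inside a longer word like "interest".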




1 Answer



Hope this helps:


from docx import Document
from shutil import copyfile
import os, re, random

# Folder which contains all the resumes
ALL_RESUMES = "all_resumes/"
# The folder which will contain the separated resumes
SEGREGATED_RESUMES = "topic_wise_resumes/"

def get_keywords(keywords_file):
    """
    Read the comma-separated keywords from keywords_file and create one folder
    per keyword. Keywords are lower-cased to avoid case mismatches down the line.
    """
    with open(keywords_file, "r") as file_open:
        # Skip empty entries (e.g. from a trailing comma)
        words = [x.strip().lower() for x in file_open.readline().split(',') if x.strip()]

    keywords = []
    for keyword in words:
        keywords.append(keyword)
        # exist_ok=True so every keyword gets a folder and reruns do not fail
        os.makedirs(SEGREGATED_RESUMES + keyword, exist_ok=True)

    return keywords

def segregate_resumes(keywords):
    """
    Copy the resumes to the appropriate folders
    """
    # The pattern for regex match; re.escape guards characters such as "+" or "."
    keyword_pattern = "|".join(re.escape(k) for k in keywords)

    # All resumes
    for filename in os.listdir(ALL_RESUMES):
        # basic sanity check
        if filename.endswith(".docx"):
            document = Document(ALL_RESUMES + filename)
            all_texts = []
            for p in document.paragraphs:
                all_texts.append(p.text)

            # The entire text in the resume in lowercase
            all_words_in_resume = " ".join(all_texts).lower()

            # The matching keywords, de-duplicated so each copy happens once
            matches = set(re.findall(keyword_pattern, all_words_in_resume))

            # Copy the resume to the keyword folder
            for match in matches:
                copyfile(ALL_RESUMES + filename, SEGREGATED_RESUMES + match + "/" + filename)

def create_sample_resumes(keywords, num=5):
    """
    Function to create sample resumes for testing
    """
    for i in range(num):
        document = Document()
        document.add_heading('RESUME{}'.format(i))
        skills_ = random.sample(keywords, 2)
        document.add_paragraph("I have skills - {} and {}".format(*skills_))
        document.save(ALL_RESUMES + "resume{}.docx".format(i))

keywords = get_keywords("ResumeKeyword.txt")
print(keywords)
# create_sample_resumes(keywords)
segregate_resumes(keywords)
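
One caveat with matching keywords by plain regex alternation is that a short keyword such as "rest" also matches inside "interest". A possible refinement, sketched here as an assumption rather than as part of the answer's method, is to require that each keyword is not surrounded by word characters. The function name `find_keywords` is hypothetical. Lookarounds are used instead of `\b` because `\b` does not behave as expected after a non-word character like the `#` in "c#".

```python
import re


def find_keywords(text, keywords):
    """Return the set of keywords that occur in text as whole tokens (case-insensitive)."""
    found = set()
    for keyword in keywords:
        # No word character directly before or after the keyword
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(keyword)
    return found


# Usage: "interested" does not count as a hit for "rest"
print(find_keywords("Interested in C# and REST APIs", ["c#", "rest", "angular"]))
```

The returned set could replace `matches` in `segregate_resumes`, since its elements are the original keyword strings and therefore line up with the folder names.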





Thank you. It works perfectly well.
– Sunny Prakash
Sep 1 at 7:25


