Flattening multiple objects in a JSON file
I am trying to flatten a JSON using python, but many JSON files have more than one object and due to this, the python script is not able to flatten it.
Eg:
........
......
There are 3000 files where this is an issue, I can't even manually go and put brackets and make it a list, how can I flatten such JSON files please?
Current Code I am using:
import json
import pandas as pd
from pandas.io.json import json_normalize
import glob
import os
import csv
import xlsxwriter
counter=0
counter1=0
writer = pd.ExcelWriter('......', engine = 'xlsxwriter')
path = '/Users/.......''
dataframeLS =
for subfolder in os.listdir(path):
for filename in glob.glob(os.path.join (path+subfolder, '*.json')):
with open(filename) as f:
try:
data = json.load(f)
dataframeLS.append(data)
counter1=counter1+1
except:
counter=counter+1
pass
A = pd.io.json.json_normalize(dataframeLS, errors = 'ignore')
A.to_excel(writer, sheet_name = 'Sheet1')
writer.save()
print(counter) # number of files with error
print(counter1) # number of flattened files successfully
python json flatten
add a comment |
I am trying to flatten a JSON using python, but many JSON files have more than one object and due to this, the python script is not able to flatten it.
Eg:
........
......
There are 3000 files where this is an issue, I can't even manually go and put brackets and make it a list, how can I flatten such JSON files please?
Current Code I am using:
import json
import pandas as pd
from pandas.io.json import json_normalize
import glob
import os
import csv
import xlsxwriter
counter=0
counter1=0
writer = pd.ExcelWriter('......', engine = 'xlsxwriter')
path = '/Users/.......''
dataframeLS =
for subfolder in os.listdir(path):
for filename in glob.glob(os.path.join (path+subfolder, '*.json')):
with open(filename) as f:
try:
data = json.load(f)
dataframeLS.append(data)
counter1=counter1+1
except:
counter=counter+1
pass
A = pd.io.json.json_normalize(dataframeLS, errors = 'ignore')
A.to_excel(writer, sheet_name = 'Sheet1')
writer.save()
print(counter) # number of files with error
print(counter1) # number of flattened files successfully
python json flatten
4
Can you show an example of some code? It will be easier to help you.
– sashaaero
Nov 12 '18 at 21:26
1
I think I get the issue but it'd be better if you put some examples like an input json file, actual output and the expected output :)
– Saul Cruz
Nov 12 '18 at 21:27
Sorry guys this is my first post, have added the code which I am using, thanks
– vb2489
Nov 12 '18 at 21:39
add a comment |
I am trying to flatten a JSON using python, but many JSON files have more than one object and due to this, the python script is not able to flatten it.
Eg:
........
......
There are 3000 files where this is an issue, I can't even manually go and put brackets and make it a list, how can I flatten such JSON files please?
Current Code I am using:
import json
import pandas as pd
from pandas.io.json import json_normalize
import glob
import os
import csv
import xlsxwriter
counter=0
counter1=0
writer = pd.ExcelWriter('......', engine = 'xlsxwriter')
path = '/Users/.......''
dataframeLS =
for subfolder in os.listdir(path):
for filename in glob.glob(os.path.join (path+subfolder, '*.json')):
with open(filename) as f:
try:
data = json.load(f)
dataframeLS.append(data)
counter1=counter1+1
except:
counter=counter+1
pass
A = pd.io.json.json_normalize(dataframeLS, errors = 'ignore')
A.to_excel(writer, sheet_name = 'Sheet1')
writer.save()
print(counter) # number of files with error
print(counter1) # number of flattened files successfully
python json flatten
I am trying to flatten a JSON using python, but many JSON files have more than one object and due to this, the python script is not able to flatten it.
Eg:
........
......
There are 3000 files where this is an issue, I can't even manually go and put brackets and make it a list, how can I flatten such JSON files please?
Current Code I am using:
import json
import pandas as pd
from pandas.io.json import json_normalize
import glob
import os
import csv
import xlsxwriter
counter=0
counter1=0
writer = pd.ExcelWriter('......', engine = 'xlsxwriter')
path = '/Users/.......''
dataframeLS =
for subfolder in os.listdir(path):
for filename in glob.glob(os.path.join (path+subfolder, '*.json')):
with open(filename) as f:
try:
data = json.load(f)
dataframeLS.append(data)
counter1=counter1+1
except:
counter=counter+1
pass
A = pd.io.json.json_normalize(dataframeLS, errors = 'ignore')
A.to_excel(writer, sheet_name = 'Sheet1')
writer.save()
print(counter) # number of files with error
print(counter1) # number of flattened files successfully
python json flatten
python json flatten
edited Nov 12 '18 at 21:38
vb2489
asked Nov 12 '18 at 21:21
vb2489vb2489
12
12
4
Can you show an example of some code? It will be easier to help you.
– sashaaero
Nov 12 '18 at 21:26
1
I think I get the issue but it'd be better if you put some examples like an input json file, actual output and the expected output :)
– Saul Cruz
Nov 12 '18 at 21:27
Sorry guys this is my first post, have added the code which I am using, thanks
– vb2489
Nov 12 '18 at 21:39
add a comment |
4
Can you show an example of some code? It will be easier to help you.
– sashaaero
Nov 12 '18 at 21:26
1
I think I get the issue but it'd be better if you put some examples like an input json file, actual output and the expected output :)
– Saul Cruz
Nov 12 '18 at 21:27
Sorry guys this is my first post, have added the code which I am using, thanks
– vb2489
Nov 12 '18 at 21:39
4
4
Can you show an example of some code? It will be easier to help you.
– sashaaero
Nov 12 '18 at 21:26
Can you show an example of some code? It will be easier to help you.
– sashaaero
Nov 12 '18 at 21:26
1
1
I think I get the issue but it'd be better if you put some examples like an input json file, actual output and the expected output :)
– Saul Cruz
Nov 12 '18 at 21:27
I think I get the issue but it'd be better if you put some examples like an input json file, actual output and the expected output :)
– Saul Cruz
Nov 12 '18 at 21:27
Sorry guys this is my first post, have added the code which I am using, thanks
– vb2489
Nov 12 '18 at 21:39
Sorry guys this is my first post, have added the code which I am using, thanks
– vb2489
Nov 12 '18 at 21:39
add a comment |
1 Answer
1
active
oldest
votes
So a sample file would look like this ?
"temp":1
"temp":2
To import this into python, you could read the file as text first and then fix the syntax errors. If you use a regex, you can deal even with irregular whitespace.
The regex looks like this:
}[^,]*
matches a closing}[^,]matches any character except,[^,]*matches as many characters except,as possiblematches the opening
Together, this matches a pair of brackets with no comma between them.
You can use the regex to replace this text with properly formatted text:
import json
import re
file = open("bad_json.json", "r")
text = file.read()
file.close()
text # 'ntemp:1nnntemp:2n'
# we convert it to an array
# add outer brackets
text = "["+text+"]"
# use the regex to replace bad formatting
pattern = re.compile(r"[^,]*")
new_text = re.sub(pattern,",
show 2 more comments
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53270299%2fflattening-multiple-objects-in-a-json-file%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
So a sample file would look like this ?
"temp":1
"temp":2
To import this into python, you could read the file as text first and then fix the syntax errors. If you use a regex, you can deal even with irregular whitespace.
The regex looks like this:
[^,]*
matches a closing[^,]matches any character except,[^,]*matches as many characters except,as possiblematches the opening
Together, this matches a pair of brackets with no comma between them.
You can use the regex to replace this text with properly formatted text:
import json
import re
file = open("bad_json.json", "r")
text = file.read()
file.close()
text # 'ntemp:1nnntemp:2n'
# we convert it to an array
# add outer brackets
text = "["+text+"]"
# use the regex to replace bad formatting
pattern = re.compile(r"[^,]*")
new_text = re.sub(pattern,",", text)
data = json.loads(new_text)
So a sample file would look like this ?
"temp":1
"temp":2
To import this into python, you could read the file as text first and then fix the syntax errors. If you use a regex, you can deal even with irregular whitespace.
The regex looks like this:
[^,]*
matches a closing[^,]matches any character except,[^,]*matches as many characters except,as possiblematches the opening
Together, this matches a pair of brackets with no comma between them.
You can use the regex to replace this text with properly formatted text:
import json
import re
file = open("bad_json.json", "r")
text = file.read()
file.close()
text # 'ntemp:1nnntemp:2n'
# we convert it to an array
# add outer brackets
text = "["+text+"]"
# use the regex to replace bad formatting
pattern = re.compile(r"[^,]*")
new_text = re.sub(pattern,",", text)
data = json.loads(new_text)
edited Nov 13 '18 at 6:56
answered Nov 12 '18 at 21:38
lhklhk
7,080105691
7,080105691
Hi, solution is not working bcos of irregular whitespaces spaces between two curly brackets, any solution on how to solve this pls ?
– vb2489
Nov 12 '18 at 23:29
I updated it to deal with arbitrary whitespace :)
– lhk
Nov 13 '18 at 6:56
Hi @lhk, the sample file is actually looking like this: "temp":1 "temp":2 Like the end of temp 1 curly bracket and the start of temp 2 curly bracket are on the Same Line and not on the another line, I hope this info makes some difference, as the code is still not working.
– vb2489
Nov 13 '18 at 14:49
That's strange. If I take your sample (the one from your comment, brackets on the same line) and plug it into my code, it works just fine. You can test it by replacing the file loading part with text = ' "temp":1 "temp":2 ' . Did you copy my exact code ?
– lhk
Nov 13 '18 at 15:06
yes, I did, let me try that, thanks
– vb2489
Nov 13 '18 at 15:22
|
show 2 more comments
Hi, solution is not working bcos of irregular whitespaces spaces between two curly brackets, any solution on how to solve this pls ?
– vb2489
Nov 12 '18 at 23:29
I updated it to deal with arbitrary whitespace :)
– lhk
Nov 13 '18 at 6:56
Hi @lhk, the sample file is actually looking like this: "temp":1 "temp":2 Like the end of temp 1 curly bracket and the start of temp 2 curly bracket are on the Same Line and not on the another line, I hope this info makes some difference, as the code is still not working.
– vb2489
Nov 13 '18 at 14:49
That's strange. If I take your sample (the one from your comment, brackets on the same line) and plug it into my code, it works just fine. You can test it by replacing the file loading part with text = ' "temp":1 "temp":2 ' . Did you copy my exact code ?
– lhk
Nov 13 '18 at 15:06
yes, I did, let me try that, thanks
– vb2489
Nov 13 '18 at 15:22
Hi, solution is not working bcos of irregular whitespaces spaces between two curly brackets, any solution on how to solve this pls ?
– vb2489
Nov 12 '18 at 23:29
Hi, solution is not working bcos of irregular whitespaces spaces between two curly brackets, any solution on how to solve this pls ?
– vb2489
Nov 12 '18 at 23:29
I updated it to deal with arbitrary whitespace :)
– lhk
Nov 13 '18 at 6:56
I updated it to deal with arbitrary whitespace :)
– lhk
Nov 13 '18 at 6:56
Hi @lhk, the sample file is actually looking like this: "temp":1 "temp":2 Like the end of temp 1 curly bracket and the start of temp 2 curly bracket are on the Same Line and not on the another line, I hope this info makes some difference, as the code is still not working.
– vb2489
Nov 13 '18 at 14:49
Hi @lhk, the sample file is actually looking like this: "temp":1 "temp":2 Like the end of temp 1 curly bracket and the start of temp 2 curly bracket are on the Same Line and not on the another line, I hope this info makes some difference, as the code is still not working.
– vb2489
Nov 13 '18 at 14:49
That's strange. If I take your sample (the one from your comment, brackets on the same line) and plug it into my code, it works just fine. You can test it by replacing the file loading part with text = ' "temp":1 "temp":2 ' . Did you copy my exact code ?
– lhk
Nov 13 '18 at 15:06
That's strange. If I take your sample (the one from your comment, brackets on the same line) and plug it into my code, it works just fine. You can test it by replacing the file loading part with text = ' "temp":1 "temp":2 ' . Did you copy my exact code ?
– lhk
Nov 13 '18 at 15:06
yes, I did, let me try that, thanks
– vb2489
Nov 13 '18 at 15:22
yes, I did, let me try that, thanks
– vb2489
Nov 13 '18 at 15:22
|
show 2 more comments
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53270299%2fflattening-multiple-objects-in-a-json-file%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
4
Can you show an example of some code? It will be easier to help you.
– sashaaero
Nov 12 '18 at 21:26
1
I think I get the issue but it'd be better if you put some examples like an input json file, actual output and the expected output :)
– Saul Cruz
Nov 12 '18 at 21:27
Sorry guys this is my first post, have added the code which I am using, thanks
– vb2489
Nov 12 '18 at 21:39