Executing Python script in Azure ML studio
I want to create a web service that summarizes the text at a given URL using Python, BeautifulSoup and NLTK.
However, I get the following error in Azure ML Studio.
Experiment layout in Azure:
The Enter Data module contains a Wikipedia URL.
The Execute Python Script module contains the following code:
import pandas as pd
import urllib.request as ur
from bs4 import BeautifulSoup
def azureml_main(dataframe1="https://en.wikipedia.org/wiki/Fluid_mechanics", dataframe2 = None):
    wiki = dataframe1[0].to_string()
    page = ur.urlopen(wiki)
    soup = BeautifulSoup(page)
    df= pd.DataFrame([soup.find_all('p')[0].get_text()], columns =['article_text'])
    return dataframe1,
Running this experiment produces the following error:
Error 0085: The following error occurred during script evaluation, please view the output log for more information:
---------- Start of error message from Python interpreter ----------
Caught exception while executing function: Traceback (most recent call last):
  File "C:\pyhome\lib\site-packages\pandas\indexes\base.py", line 1876, in get_loc
    return self._engine.get_loc(key)
  File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4027)
  File "pandas\index.pyx", line 157, in pandas.index.IndexEngine.get_loc (pandas\index.c:3891)
  File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12408)
  File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12359)
KeyError: 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\server\invokepy.py", line 199, in batch
    odfs = mod.azureml_main(*idfs)
  File "C:\temp\84d7e9fbcfe54596a2e7de022b4d236c.py", line 23, in azureml_main
    wiki = dataframe1[0][0].to_string()
  File "C:\pyhome\lib\site-packages\pandas\core\frame.py", line 1992, in __getitem__
    return self._getitem_column(key)
  File "C:\pyhome\lib\site-packages\pandas\core\frame.py", line 1999, in _getitem_column
    return self._get_item_cache(key)
  File "C:\pyhome\lib\site-packages\pandas\core\generic.py", line 1345, in _get_item_cache
    values = self._data.get(item)
  File "C:\pyhome\lib\site-packages\pandas\core\internals.py", line 3225, in get
    loc = self.items.get_loc(item)
  File "C:\pyhome\lib\site-packages\pandas\indexes\base.py", line 1878, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4027)
  File "pandas\index.pyx", line 157, in pandas.index.IndexEngine.get_loc (pandas\index.c:3891)
  File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12408)
  File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12359)
KeyError: 0
Process returned with non-zero exit code 1
---------- End of error message from Python interpreter ----------
Start time: UTC 11/11/2018 15:34:21
End time: UTC 11/11/2018 15:34:30
- I am using Anaconda 4.0/Python 3.5 to run this snippet.
- When I assign the URL directly to the variable wiki, the code runs successfully on my local machine.
- I am not sure why I cannot fetch the value from the input dataframe1.
- The input dataframe has no header, so I expected dataframe1[0] to fetch the URL directly.
Thanks for any help with this.
python pandas beautifulsoup
asked Nov 12 '18 at 3:27 by user7434438
edited Nov 12 '18 at 11:05 by ewwink
1 Answer
Your dataframe1 looks like this:
dataframe1 = {'Col1': ['https://en.wikipedia.org/wiki/Finite_element_method']}
The column key is not the integer 0 but the string 'Col1'. You can fix the lookup with
wiki = dataframe1['Col1'].to_string(index=0)
but that raises another error, because the URL is trimmed if it is too long:
https://en.wikipedia.org/wiki/Finite_element....
so it is better to use
wiki = dataframe1['Col1'][0]
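The lookup behaviour described above can be reproduced locally without Azure. This is only a sketch: the DataFrame below is a stand-in for the Enter Data output, with the 'Col1' column name taken from this answer, and `.iloc[0]` is swapped in for `[0]` as a purely positional way to read the first row.

```python
import pandas as pd

# Stand-in for the Enter Data output; 'Col1' is the column name reported in the answer.
url = "https://en.wikipedia.org/wiki/Finite_element_method"
dataframe1 = pd.DataFrame({"Col1": [url]})

# The integer 0 is not a column label here, so label-based lookup fails.
try:
    dataframe1[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# to_string() renders the Series for display and may shorten long values,
# while positional indexing returns the stored Python string unchanged.
rendered = dataframe1["Col1"].to_string(index=False)
direct = dataframe1["Col1"].iloc[0]

print(lookup_failed)
print(repr(rendered))
print(direct == url)
```

Whether the rendered string is actually shortened depends on the pandas version and display options, which is one more reason to read the cell value directly.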
Another error is
return dataframe1,
which should be
return df,
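The trailing comma in both return statements is deliberate: it wraps the value in a one-element tuple, which, as far as I know, is the kind of sequence the Execute Python Script module expects back. The bug is only that the input frame was returned instead of the new one. A small illustration with a made-up DataFrame:

```python
import pandas as pd

# Made-up result frame, only for illustration.
df = pd.DataFrame({"article_text": ["example paragraph"]})

def returns_tuple():
    # The trailing comma builds a one-element tuple: (df,)
    return df,

def returns_frame():
    # Without the comma the bare DataFrame is returned.
    return df

result = returns_tuple()
print(type(result).__name__, len(result))  # tuple 1
print(type(returns_frame()).__name__)      # DataFrame
```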
Fixed code:
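import pandas as pd
import urllib.request as ur
from bs4 import BeautifulSoup

def azureml_main(dataframe1=None, dataframe2=None):
    # The input column is named 'Col1'; take the URL from its first row
    wiki = dataframe1['Col1'][0]
    page = ur.urlopen(wiki)
    # Name the parser explicitly so bs4 does not have to guess one
    soup = BeautifulSoup(page, 'html.parser')
    # Keep the text of the first <p> element as a one-row DataFrame
    df = pd.DataFrame([soup.find_all('p')[0].get_text()], columns=['article_text'])
    # Return the new frame, not the input, wrapped in a tuple
    return df,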
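The question assumed that an input without a header could be read as dataframe1[0]. If you would rather not depend on the column being called 'Col1', reading the cell by position is one option. The helper below is hypothetical (`first_url` is not part of Azure ML or pandas) and is only a sketch of that idea, with a basic check that the value looks like an http(s) URL before it is passed to urlopen.

```python
import pandas as pd
from urllib.parse import urlparse

def first_url(frame):
    """Return the first cell of the first column if it looks like an http(s) URL."""
    if frame is None or frame.empty:
        raise ValueError("input DataFrame is empty")
    # iloc is positional, so this does not depend on the column name.
    value = str(frame.iloc[0, 0]).strip()
    parts = urlparse(value)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("not an http(s) URL: " + value)
    return value

frame = pd.DataFrame({"Col1": ["https://en.wikipedia.org/wiki/Fluid_mechanics"]})
print(first_url(frame))
```

Inside azureml_main this would replace the `dataframe1['Col1'][0]` line with `wiki = first_url(dataframe1)`.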
Thank you so much! It worked.
– user7434438, Nov 12 '18 at 16:08
You're welcome. Please consider marking the answer as correct.
– ewwink, Nov 12 '18 at 21:03
answered Nov 12 '18 at 11:14 by ewwink, edited Nov 12 '18 at 11:34