How to find Top features from Naive Bayes using sklearn pipeline

Hi all,

I am trying to apply Naive Bayes (MultinomialNB) using pipelines, and I came up with the code below. I am interested in finding the top 10 positive and negative words, but I have not been able to. When I searched, I found code for finding the top features, but only for models fitted without a pipeline. When I use that code on the output of my pipeline, I get the errors shown below. Could you please help me find feature importance from the pipeline output?

    from sklearn import preprocessing
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Pipeline dictionary
    pipelines = {
        'bow_MultinomialNB': make_pipeline(
            CountVectorizer(),
            preprocessing.Normalizer(),
            MultinomialNB()
        )
    }

    # List tuneable hyperparameters of our pipeline
    pipelines['bow_MultinomialNB'].get_params()

    # BOW - MultinomialNB hyperparameters
    bow_MultinomialNB_hyperparameters = {
        'multinomialnb__alpha': [1000, 500, 100, 50, 10, 5, 1, 0.5, 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001]
    }

    # Create hyperparameters dictionary
    hyperparameters = {
        'bow_MultinomialNB': bow_MultinomialNB_hyperparameters
    }

    tscv = TimeSeriesSplit(n_splits=3)  # For time-based splitting
    for name, pipeline in pipelines.items():
        print("NAME:", name)
        print("PIPELINE:", pipeline)

    %time
    # Create empty dictionary called fitted_models
    fitted_models = {}

    # Loop through model pipelines, tuning each one and saving it to fitted_models
    for name, pipeline in pipelines.items():
        # Create cross-validation object from pipeline and hyperparameters
        model = GridSearchCV(pipeline, hyperparameters[name], cv=tscv, n_jobs=1, verbose=1)

        # Fit model on X_train, y_train
        model.fit(X_train, y_train)

        # Store model in fitted_models[name]
        fitted_models[name] = model

        # Print 'name has been fitted'
        print(name, 'has been fitted.')

FEATURE IMPORTANCE:

    pipelines['bow_MultinomialNB'].steps[2][1].classes_

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-125-7d45b007e86b> in <module>()
    ----> 1 pipelines['bow_MultinomialNB'].steps[2][1].classes_

    AttributeError: 'MultinomialNB' object has no attribute 'classes_'


    pipelines['bow_MultinomialNB'].steps[0][1].get_feature_names()

    ---------------------------------------------------------------------------
    NotFittedError                            Traceback (most recent call last)
    <ipython-input-126-2883929221d1> in <module>()
    ----> 1 pipelines['bow_MultinomialNB'].steps[0][1].get_feature_names()

    ~\Anaconda3\lib\site-packages\sklearn\feature_extraction\text.py in get_feature_names(self)
        958     def get_feature_names(self):
        959         """Array mapping from feature integer indices to feature name"""
    --> 960         self._check_vocabulary()
        961
        962         return [t for t, i in sorted(six.iteritems(self.vocabulary_),

    ~\Anaconda3\lib\site-packages\sklearn\feature_extraction\text.py in _check_vocabulary(self)
        301         """Check if vocabulary is empty or missing (not fit-ed)"""
        302         msg = "%(name)s - Vocabulary wasn't fitted."
    --> 303         check_is_fitted(self, 'vocabulary_', msg=msg),
        304
        305         if len(self.vocabulary_) == 0:

    ~\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_is_fitted(estimator, attributes, msg, all_or_any)
        766
        767     if not all_or_any([hasattr(estimator, attr) for attr in attributes]):
    --> 768         raise NotFittedError(msg % {'name': type(estimator).__name__})
        769
        770

    NotFittedError: CountVectorizer - Vocabulary wasn't fitted.


    x = pipelines['bow_MultinomialNB'].steps[0][1]._validate_vocabulary()
    x.get_feature_names()

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-120-f620c754a34e> in <module>()
    ----> 1 x.get_feature_names()

    AttributeError: 'NoneType' object has no attribute 'get_feature_names'

Regards,
Shree

edited Nov 12 '18 at 2:18

asked Nov 11 '18 at 20:17

premgnc1983

Is there a reason you're looking at the pipelines object instead of the fitted model?

– Jarad
Nov 12 '18 at 3:38

Either way it did not work. Actually, I am saving each fitted model with `fitted_models[name] = model`. I am just interested in getting those error lines to work.

– premgnc1983
Nov 12 '18 at 12:46
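
A minimal sketch of what the first comment points at, assuming the default `GridSearchCV` behaviour (`refit=True`): the search fits clones of the pipeline, so the objects in `pipelines` stay unfitted, while the refitted copy is available as `best_estimator_`. The toy corpus, its labels, the short `alpha` grid, and the choice to rank words by the difference in `feature_log_prob_` between the two classes are illustrative assumptions, not something taken from the question.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer

# Hypothetical toy data standing in for X_train / y_train (1 = positive).
docs = ["good great film", "bad awful film", "great good story", "awful bad plot",
        "good fun film", "bad dull film", "great fun story", "awful dull plot"]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

pipeline = make_pipeline(CountVectorizer(), Normalizer(), MultinomialNB())
search = GridSearchCV(pipeline, {"multinomialnb__alpha": [1, 0.1]},
                      cv=TimeSeriesSplit(n_splits=3))
search.fit(docs, labels)

# GridSearchCV works on clones, so the original pipeline is still unfitted.
print(hasattr(pipeline.named_steps["multinomialnb"], "classes_"))

# The refitted pipeline lives on the search object.
best = search.best_estimator_
vectorizer = best.named_steps["countvectorizer"]
clf = best.named_steps["multinomialnb"]

# get_feature_names_out() exists in newer scikit-learn releases;
# older ones (as in the question) only have get_feature_names().
if hasattr(vectorizer, "get_feature_names_out"):
    names = np.asarray(vectorizer.get_feature_names_out())
else:
    names = np.asarray(vectorizer.get_feature_names())

# Rows of feature_log_prob_ follow clf.classes_, here [0, 1].
log_ratio = clf.feature_log_prob_[1] - clf.feature_log_prob_[0]
order = np.argsort(log_ratio)

top_negative = names[order[:3]]        # words most indicative of class 0
top_positive = names[order[::-1][:3]]  # words most indicative of class 1
print("positive:", list(top_positive))
print("negative:", list(top_negative))
```

With the question's objects, the same lookup would start from `fitted_models['bow_MultinomialNB'].best_estimator_`, and the slices would take 10 words instead of 3.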

scikit-learn pipeline feature-extraction naivebayes
