This is the code cell from my program where I am getting an error. The dataset used is twitter.csv.
x = np.array(df["tweet"])
y = np.array(df["labels"])
cv = CountVectorizer()
x = cv.fit_transform(x)
x_train, x_test, y_train, y_test = train_test_split(x,y, test_size= 0.25, random_state= 42)
clf = DecisionTreeClassifier()
clf.fit(x_train,y_train)
The error that occurred is:
ValueError Traceback (most recent call last)
Cell In[52], line 8
6 x_train, x_test, y_train, y_test = train_test_split(x,y, test_size= 0.25, random_state= 42)
7 clf = DecisionTreeClassifier()
----> 8 clf.fit(x_train,y_train)
File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\sklearn\tree\_classes.py:889, in DecisionTreeClassifier.fit(self, X, y, sample_weight, check_input)
859 def fit(self, X, y, sample_weight=None, check_input=True):
860 """Build a decision tree classifier from the training set (X, y).
861
862 Parameters
(...)
886 Fitted estimator.
887 """
--> 889 super().fit(
890 X,
891 y,
892 sample_weight=sample_weight,
893 check_input=check_input,
894 )
895 return self
File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\sklearn\tree\_classes.py:186, in BaseDecisionTree.fit(self, X, y, sample_weight, check_input)
184 check_X_params = dict(dtype=DTYPE, accept_sparse="csc")
...
--> 111 raise ValueError("Input contains NaN")
113 # We need only consider float arrays, hence can early return for all else.
114 if X.dtype.kind not in "fc":
ValueError: Input contains NaN
The message says the input contains NaN. I tried some of the fixes suggested online, such as fill(0), but they did not work.
What changes do I need to make to fix this error?
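One way to narrow this down is sketched here, assuming the NaN is in the labels column rather than in the tweets. The traceback stops at the `raise ValueError("Input contains NaN")` line that sits just before the "only consider float arrays" comment, which appears to be the check applied to non-float (object) arrays, and x had already passed through CountVectorizer without an error, so y is the more likely culprit. The small DataFrame is a hypothetical stand-in for twitter.csv, and dropping the incomplete rows is only one possible choice.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for twitter.csv: one row has a missing label.
df = pd.DataFrame({
    "tweet": ["good day", "bad day", "nice one", "awful thing",
              "great stuff", "terrible mess", "lovely time", "horrible idea"],
    "labels": ["ok", "offensive", "ok", np.nan,
               "ok", "offensive", "ok", "offensive"],
})

# Count missing values per column to see where the NaN comes from.
missing = df[["tweet", "labels"]].isna().sum()
print(missing)

# Drop rows where either column is missing, before building x and y.
clean = df.dropna(subset=["tweet", "labels"])

x = np.array(clean["tweet"])
y = np.array(clean["labels"])

cv = CountVectorizer()
x = cv.fit_transform(x)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42
)

clf = DecisionTreeClassifier()
clf.fit(x_train, y_train)
```

If the missing labels come from an earlier step (for example, a mapping that does not cover every class value), fixing that step may be preferable to dropping rows. Note also that the pandas method for filling missing values is `fillna`, and that filling a label column with 0 would introduce an extra class rather than remove the problem.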
source https://stackoverflow.com/questions/76301529/python-valueerror-input-contains-nan