Category: databricks

I got an error calculating dates using pyspark.pandas. Is there any way to calculate dates with pyspark.pandas?

    import pyspark.pandas as pd
    from datetime import timedelta

    df = pd.DataFrame({'year': [2015, 2016], 'month': [2, 3], 'day': [4, 5]})
    df = pd.to_datetime(df)
    df + timedelta(days=3)
    df.add(timedelta(days=3))  # this yields the same error

This yields the error TypeError: Addition can not be ..

Read more
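The failing operation in the question above is element-wise date arithmetic, which pandas-on-Spark rejects when a timedelta is added to a whole frame. A minimal sketch of the same shift done per row with the standard library only (no Spark required), which is also the shape a workaround via Series.apply or a UDF would take; the row data is copied from the question:

```python
from datetime import date, timedelta

# Same rows as in the question
rows = [{"year": 2015, "month": 2, "day": 4},
        {"year": 2016, "month": 3, "day": 5}]

def shift(row, days=3):
    # Build a real date, then add the timedelta element by element
    return date(row["year"], row["month"], row["day"]) + timedelta(days=days)

shifted = [shift(r) for r in rows]
# shifted[0] -> date(2015, 2, 7)
```

This sidesteps the whole-frame addition that triggers the TypeError, at the cost of per-row Python execution.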

I have uploaded my Databricks notebooks to a repo and replaced %run statements with import, using the newly generally available Databricks features (Repos integration and Python import): https://databricks.com/blog/2021/10/07/databricks-repos-is-now-generally-available.html But it seems it is not working. I have already activated the Repos integration option in the Admin panel, but I get this error: ModuleNotFoundError: No module named 'petitions' ..

Read more
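A common cause of a ModuleNotFoundError after switching from %run to import is that the module's directory is not on sys.path. A minimal sketch, assuming a hypothetical repo path (Repos are mounted under /Workspace/Repos/&lt;user&gt;/&lt;repo&gt;); the 'petitions' module name is taken from the question:

```python
import sys

# Hypothetical repo root; adjust to your own user and repo name
repo_root = "/Workspace/Repos/someone@example.com/my-repo"

# Make modules at the repo root (e.g. petitions.py) importable
if repo_root not in sys.path:
    sys.path.insert(0, repo_root)

# import petitions  # should now resolve if petitions.py exists at repo_root
```

If the repo-files feature is enabled, Databricks adds the repo root to sys.path automatically for notebooks inside the repo, so this manual insert is mainly useful for diagnosing where the import is actually looking.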

I have a column with a JSON array and I'm trying to create a new column with only a partial amount of the JSON, plus some potential transforms on the JSON data. I'm using the following Databricks page as a reference: https://docs.azuredatabricks.net/_static/notebooks/transform-complex-data-types-python.html

    ID  js1
    1   {"a":1, "b":1}

And I want to return:

    ID  js1  js2 ..

Read more
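The linked notebook covers from_json/to_json with an explicit schema; as a plain-Python sketch of the same idea (parse js1, keep a subset, emit js2), which could be wrapped in a UDF. The field kept ("a") and the derived field ("a_plus_b") are assumptions, since the desired js2 is not shown in the excerpt:

```python
import json

def partial_json(js1: str) -> str:
    # Parse the source JSON, keep a subset, and add one derived field
    obj = json.loads(js1)
    out = {"a": obj["a"], "a_plus_b": obj["a"] + obj["b"]}
    return json.dumps(out)

row = {"ID": 1, "js1": '{"a":1, "b":1}'}
row["js2"] = partial_json(row["js1"])
```

For large tables, doing the same with from_json plus a selected schema and to_json keeps the work inside Spark instead of a row-at-a-time Python UDF.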

I've started working with Databricks Python notebooks recently and can't understand how to read multiple .csv files from DBFS the way I did in Jupyter notebooks earlier. I've tried:

    path = r'dbfs:/FileStore/shared_uploads/path/'
    all_files = glob.glob(path + "/*.csv")
    li = []
    for filename in all_files:
        df = pd.read_csv(filename, index_col=None, header=0, low_memory=False)
        li.append(df)
    data = pd.concat(li, axis=0, ..

Read more
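glob works on local filesystem paths, and dbfs:/ is a URI scheme, not a local path, so the glob above matches nothing. On Databricks, DBFS is also exposed through the local FUSE mount at /dbfs, so globbing /dbfs/FileStore/... instead of dbfs:/FileStore/... is the usual fix. A sketch with the standard library (the pandas read_csv/concat loop from the question works the same once the paths are local-style); taking the directory as an argument makes it testable off-cluster:

```python
import csv
import glob
import os

def read_all_csvs(directory):
    # Collect rows from every CSV in the directory (sorted for determinism)
    rows = []
    for filename in sorted(glob.glob(os.path.join(directory, "*.csv"))):
        with open(filename, newline="") as f:
            rows.extend(csv.DictReader(f))
    return rows

# On Databricks (hypothetical path) -- note the /dbfs prefix, not dbfs:/
# rows = read_all_csvs("/dbfs/FileStore/shared_uploads/path")
```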

I have this function which converts seconds to dd:hh:mm:ss (a string). However, when there is a null value in the input column, I receive the error PythonException: TypeError: unsupported operand type(s) for divmod(): 'NoneType' and 'int'. Is there a fix that can be put inside the function below?

    def to_hms(s):
        m, s = divmod(s, ..

Read more
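The truncated function presumably continues the usual divmod chain; a sketch with a null guard added at the top. Returning None for null input is one choice (returning a placeholder string is another), and the exact dd:hh:mm:ss formatting is an assumption:

```python
def to_hms(s):
    # Guard: a null input would otherwise crash divmod with NoneType
    if s is None:
        return None
    m, s = divmod(int(s), 60)   # seconds
    h, m = divmod(m, 60)        # minutes
    d, h = divmod(h, 24)        # hours -> days
    return f"{d:02d}:{h:02d}:{m:02d}:{s:02d}"

# to_hms(90061) -> "01:01:01:01"; to_hms(None) -> None
```

In Spark, an alternative is to keep the function strict and instead coalesce or filter the nulls before applying the UDF.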

I am trying to get all the files and their subdirectories from a container in an Azure storage account in a different subscription, and the business requirement is to use the abfss URL: abfss://&lt;container&gt;@&lt;storage-account&gt;.dfs.core.windows.net/&lt;path&gt;/. I tried to import the Spark config for the subscription and used the code below to return the file list, yet it failed. ..

Read more
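dbutils.fs.ls is not recursive, so listing a whole container needs an explicit walk. A sketch abstracted over the listing function so it runs off-cluster; on Databricks you would adapt dbutils.fs.ls (which returns FileInfo objects with .path and .isDir()) into the (path, is_dir) pairs assumed here. Cross-subscription access additionally requires the target storage account's auth settings in the Spark config, which is a separate concern from the traversal itself:

```python
def walk(path, list_fn):
    # Recursively collect file paths.
    # list_fn(path) must return an iterable of (child_path, is_dir) pairs.
    files = []
    for child_path, is_dir in list_fn(path):
        if is_dir:
            files.extend(walk(child_path, list_fn))
        else:
            files.append(child_path)
    return files

# On a cluster (hypothetical adapter over dbutils):
# adapter = lambda p: [(fi.path, fi.isDir()) for fi in dbutils.fs.ls(p)]
# all_files = walk("abfss://container@account.dfs.core.windows.net/", adapter)
```

Abstracting over list_fn also makes the traversal testable with an in-memory fake, without any Azure credentials.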