Databricks


List existing mounts, browse a mount point, and unmount:

%fs mounts
%fs ls /mnt/<mount-point-name>
%fs unmount /mnt/<mount-point-name>
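
The same operations are available from Python through dbutils.fs; a minimal sketch:

# dbutils equivalents of the %fs magic commands above
display(dbutils.fs.mounts())                       # list all mount points
display(dbutils.fs.ls("/mnt/<mount-point-name>"))  # browse a mount
dbutils.fs.unmount("/mnt/<mount-point-name>")      # remove a mount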

Access key – Mount ADLS #

dbutils.fs.mount(
  source="abfss://<container-name>@<your-storage-account-name>.dfs.core.windows.net/",
  mount_point="/mnt/<mount-point-name>",
  extra_configs={
    "fs.azure.account.key.<your-storage-account-name>.dfs.core.windows.net": "<your-access-key>"
  }
)
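
Hardcoding the access key works for a quick test, but it is better pulled from a secret scope (configured later on this page). A minimal sketch, where the scope and secret names are illustrative placeholders:

# Fetch the storage account access key from a secret scope instead of hardcoding it
access_key = dbutils.secrets.get(scope="<your-scope-name>", key="<access-key-secret-name>")
# Pass access_key as the value in extra_configs in place of the literal key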

SAS key – Mount ADLS #

dbutils.fs.mount(
  source="abfss://<container-name>@<your-storage-account-name>.dfs.core.windows.net/",
  mount_point="/mnt/<mount-point-name>",
  extra_configs={
    "fs.azure.sas.<your-container-name>.<your-storage-account-name>.dfs.core.windows.net": "<your-sas-token>"
  }
)


Mount Data Lake Storage Gen2 using Service Principal credentials #

# Service Principal credentials

client_id = "<your-service-principal-client-id>"
client_secret = "<your-service-principal-client-secret>"
tenant_id = "<your-tenant-id>"

# Mount Data Lake Storage Gen2 using Service Principal credentials
dbutils.fs.mount(
  source="abfss://<container-name>@<your-storage-account-name>.dfs.core.windows.net/",
  mount_point="/mnt/<mount-point-name>",
  extra_configs={
    "fs.azure.account.auth.type.<your-storage-account-name>.dfs.core.windows.net": "OAuth",
    "fs.azure.account.oauth.provider.type.<your-storage-account-name>.dfs.core.windows.net": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id.<your-storage-account-name>.dfs.core.windows.net": client_id,
    "fs.azure.account.oauth2.client.secret.<your-storage-account-name>.dfs.core.windows.net": client_secret,
    "fs.azure.account.oauth2.client.endpoint.<your-storage-account-name>.dfs.core.windows.net": f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
  }
)
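
dbutils.fs.mount raises an error if the target path is already mounted, so a guard like the following is common before remounting; a minimal sketch:

mount_point = "/mnt/<mount-point-name>"
# Unmount a stale mount before remounting with fresh credentials
if any(m.mountPoint == mount_point for m in dbutils.fs.mounts()):
    dbutils.fs.unmount(mount_point)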

Configure Azure Key Vault secrets #

kv_scope = "<your-secret-scope-name>"
client_id_secret_name = "<client-id-secret-name>"
client_secret_secret_name = "<client-secret-secret-name>"

ADLS – Service Principal – Key Vault #

0_Permissions on the storage account for the service principal
Assign the Storage Blob Data Contributor role to the service principal at the storage account level.

0_Permissions from the Databricks managed identity to Key Vault
Grant Get and List permissions on secrets.

0_Storage secrets in Key Vault
client_id_secret_name: holds the service principal client ID
client_secret_secret_name: holds the service principal client secret



#1_Create secret scope
https://<databricks-instance>/#secrets/createScope
Scope name: kv_scope
Manage principal: Creator / All users

Azure Key Vault
DNS name: https://xxx.vault.azure.net
Resource ID: /subscriptions/xxxxxx/...


#2_Databricks CLI installation
Install Python & pip, then install the CLI:
python --version
pip --version
pip install databricks-cli
databricks --version

#3_Databricks CLI configuration
databricks configure --token
# prompts for the Databricks host (https://<your-workspace-url>) and a personal access token

The resulting profile is written to C:\Users\<username>\.databrickscfg:
[DEFAULT]
host = https://<your-workspace-url>
token = <your-access-token>


#4_Review scopes and secrets
databricks secrets list-scopes
databricks secrets list --scope <scope-name>
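
Scopes and secret names can also be inspected from a notebook; secret values themselves are redacted in notebook output. A minimal sketch:

display(dbutils.secrets.listScopes())          # all scopes visible to the workspace
display(dbutils.secrets.list(scope=kv_scope))  # secret names (metadata only) in the scope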


# Mount ADLS with a service principal using Key Vault backed secrets

client_id = dbutils.secrets.get(scope=kv_scope, key=client_id_secret_name)
client_secret = dbutils.secrets.get(scope=kv_scope, key=client_secret_secret_name)

dbutils.fs.mount(
  source="abfss://<container-name>@<your-storage-account-name>.dfs.core.windows.net/",
  mount_point="/mnt/<mount-point-name>",
  extra_configs={
    "fs.azure.account.auth.type.<your-storage-account-name>.dfs.core.windows.net": "OAuth",
    "fs.azure.account.oauth.provider.type.<your-storage-account-name>.dfs.core.windows.net": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id.<your-storage-account-name>.dfs.core.windows.net": client_id,
    "fs.azure.account.oauth2.client.secret.<your-storage-account-name>.dfs.core.windows.net": client_secret,
    "fs.azure.account.oauth2.client.endpoint.<your-storage-account-name>.dfs.core.windows.net": "https://login.microsoftonline.com/<your-tenant-id>/oauth2/token"
  }
)
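
Once mounted, the lake is addressable through the DBFS path; for example (the file path is an illustrative placeholder):

# Read a sample CSV through the new mount point
df = spark.read.option("header", True).csv("/mnt/<mount-point-name>/<path-to-file>.csv")
display(df)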

Unity Catalog Metastore Configuration #

Permissions
Role: Storage Blob Data Contributor on the Data Lake Gen2 storage account, granted to the Databricks Access Connector.

The user should have Global Administrator rights.
Deploy Databricks with a Premium plan.
Log in at https://accounts.azuredatabricks.net/ to get Azure Databricks account access.

Go to Data > Metastore > Create metastore
Name: sample-metastore
Region: eastus (the ADB workspace should be in eastus; the ADLS account can be in any location)
ADLS Gen2 path: <container-name>@<your-storage-account-name>.dfs.core.windows.net/
Access Connector ID: get the ID from the access connector (created along with the ADB Premium account)
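
Once the metastore is created and attached to the workspace, it can be sanity-checked from a notebook; a hedged sketch (current_metastore() is a Unity Catalog SQL function):

# Confirm the workspace is attached to the new metastore
display(spark.sql("SELECT current_metastore()"))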

SCIM integration #

Create the link with the workspace.

Log in at https://accounts.azuredatabricks.net/ to get Azure Databricks account access.
Go to Settings > Generate token.

SCIM token: dsapi23c3279bf840ffacf3beb976d59sdfw2 (example)
Account SCIM URL: https://accounts.azuredatabricks.net/api/2.0/accounts/<account-id>/scim/v2

Go to the enterprise application registration in Azure AD.

Provisioning: enter the SCIM token & Account SCIM URL.
Users & groups: add the users, groups, and service principals to push to the Databricks account (pushed to the account roughly every 40 minutes).

User roles can then be managed centrally from the Databricks account console. Roles: Account admin, Marketplace admin.

1. Introduction to Azure Databricks

2. Create an Azure Databricks Workspace using Azure Portal

3. Create Databricks Community Edition Account

4. Workspace in Azure Databricks

5. Workspace assets in Azure Databricks

6. Working with Workspace Objects in Azure Databricks

7. Create and Run Spark Job in Databricks

8. Azure Databricks architecture overview

9. Databricks File System(DBFS) overview in Azure Databricks

10. Databricks Utilities(dbutils) in Azure Databricks

11. Data Utility(dbutils.data) in Azure Databricks in Databricks utilities

12. File System utility(dbutils.fs) of Databricks Utilities in Azure Databricks

13. exit() command of notebook utility(dbutils.notebook) in Azure Databricks

14. run() command of notebook utility(dbutils.notebook) in Databricks Utilities in Azure Databricks

15. Widgets utility(dbutils.widgets) of Databricks Utilities in Azure Databricks

16. Pass values to notebook parameters from another notebook using run command in Azure Databricks

17. Parameterize SQL notebook using widgets in Azure Databricks | Widgets in SQL in Azure Databricks

18. Create Mount point using dbutils.fs.mount() in Azure Databricks

19. Mount Azure Blob Storage to DBFS in Azure Databricks

20. Delete or Unmount Mount Points in Azure Databricks

21. mounts() & refreshMounts() commands of File system Utilities in Azure Databricks

22. Update Mount Point(dbutils.fs.updateMount()) in Azure Databricks

23. Secret Scopes Overview in Azure Databricks

24. Install Databricks CLI and configure with your workspace | Azure Databricks

25. Create an Azure Key Vault backed secret scope using the UI in Azure Databricks

26. Create a Databricks backed secret scope in Azure Databricks

27. Secrets Utility(dbutils.secrets) of Databricks Utilities in Azure Databricks

28. Access ADLS Gen2 storage using Account Key in Azure Databricks

29. Configure access to Azure storage with an Azure Active Directory service principal

30. Access Data Lake Storage Gen2 or Blob Storage with an Azure service principal in Azure Databricks

31. Access ADLS Gen2 or Blob Storage using a SAS token in Azure Databricks

spark.conf.set("fs.azure.account.auth.type.storageac.dfs.core.windows.net", "SAS")

spark.conf.set("fs.azure.sas.token.provider.type.storageac.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider")

spark.conf.set("fs.azure.sas.fixed.token.storageac.dfs.core.windows.net", "<token>")
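
With these session configs set, data can be read directly over abfss without a mount point; a minimal sketch using the same storageac account (container and path are placeholders):

# Read directly from ADLS Gen2 using the SAS session configuration
df = spark.read.parquet("abfss://<container-name>@storageac.dfs.core.windows.net/<path>")
display(df)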
