# Security Review — Azure-AI-RAG-CSharp-Semantic-Kernel-Functions

| Field | Value |
|---|---|
| **Project** | Azure-AI-RAG-CSharp-Semantic-Kernel-Functions |
| **Review Date** | 2026-03-21 |
| **Steward** | Security Steward |
| **Layers Audited** | C# ASP.NET Core API, Python Azure Function, React frontend, Bicep infrastructure |
| **Critical** | 5 |
| **Notable** | 8 |
| **Minor** | 3 |
| **Info** | 2 |
| **Total** | 18 |

---

## 1. Security Posture Overview

The project is an Azure-hosted RAG chatbot consisting of four layers: a C# ASP.NET Core API (`ChatAPI`), a Python Azure Function (`DocumentLoaderFunction`), a React SPA frontend, and Bicep infrastructure definitions. The application uses Azure Managed Identity for most service-to-service authentication, which is a strong baseline. However, several significant security gaps exist across all four layers that must be addressed before this system is suitable for production exposure.

The most urgent concerns are:

- **No authentication or authorization** on any API endpoint — the chat and session endpoints are completely public.
- **Prompt injection** is possible: raw user input is inserted into AI prompts without any sanitization.
- **Unsanitized AI HTML output** is rendered directly in the browser using `html-react-parser`, creating an XSS path through the AI model's output.
- **Swagger UI is unconditionally enabled** in all environments, including production, exposing the full API surface.
- **CORS is fully open** (`AllowAnyOrigin` + `AllowAnyMethod` + `AllowAnyHeader`), which is unnecessarily permissive.
- **Storage account** has `publicNetworkAccess: Enabled`, `allowSharedKeyAccess: true`, and three blob containers with no explicit `publicAccess: None` setting.
- **CosmosDB** has `disableLocalAuth: false` hardcoded in the account module, allowing key-based authentication even though Managed Identity is used.
- **Key Vault** is referenced (a `KeyVaultUri` app setting exists) but the infrastructure defines no Key Vault resource at all — secrets management is incomplete.
- **The Python function sets `OPENAI_API_KEY` in the process environment at runtime** from a live credential token, leaking the bearer token into `environ` where it could be read by any co-located code.

---

## 2. Secrets and Credentials Assessment

No hardcoded secrets, passwords, API keys, or connection strings were found committed to source control. The project correctly uses `DefaultAzureCredential` for Azure service authentication in both the C# API and the Python function, and environment variables / App Service configuration for all sensitive values.

One notable issue exists in the Python function:

```python
token = credential.get_token("https://cognitiveservices.azure.com/.default").token
environ["OPENAI_API_KEY"] = token
environ["AZURE_OPENAI_AD_TOKEN"] = environ["OPENAI_API_KEY"]
```

This writes a live bearer token into the process environment. While not a committed secret, it is an insecure credential-handling pattern: `os.environ` is a global mutable mapping, so the token stays there for the lifetime of the process, is inherited by any child process, and can be read by other code in the same process or captured by environment-variable logging. The function also reads `AZURE_OPENAI_API_KEY` via `environ.get()` and passes it to `AzureOpenAIEmbeddings`; that value comes from configuration and is not hardcoded.
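
As a rough illustration of the alternative, the sketch below keeps the token inside a closure and hands callers a zero-argument provider instead of writing to `os.environ`. The names `make_token_provider`, `FakeCredential`, and the local `AccessToken` stand-in are hypothetical; the only assumption about the real credential is that `get_token(scope)` returns an object with `token` and `expires_on` fields, as `DefaultAzureCredential` does.

```python
import time
from typing import Callable, NamedTuple


class AccessToken(NamedTuple):
    """Stand-in with the same two fields as azure.core.credentials.AccessToken."""
    token: str
    expires_on: int  # Unix timestamp


def make_token_provider(credential, scope: str, refresh_margin: int = 300) -> Callable[[], str]:
    """Return a callable that yields a currently valid token.

    The token lives only in this closure (never in os.environ) and is
    re-requested once it is within `refresh_margin` seconds of expiry.
    """
    cached = None

    def provider() -> str:
        nonlocal cached
        if cached is None or cached.expires_on - time.time() < refresh_margin:
            cached = credential.get_token(scope)
        return cached.token

    return provider


class FakeCredential:
    """Hypothetical test double; a deployment would use DefaultAzureCredential."""

    def __init__(self) -> None:
        self.calls = 0

    def get_token(self, scope: str) -> AccessToken:
        self.calls += 1
        return AccessToken(token=f"token-{self.calls}", expires_on=int(time.time()) + 3600)


credential = FakeCredential()
provider = make_token_provider(credential, "https://cognitiveservices.azure.com/.default")
first = provider()
second = provider()  # served from the closure, no second credential call
```

In the actual function this hand-rolled helper is probably unnecessary: `azure-identity` ships `get_bearer_token_provider(credential, scope)`, which should serve the same purpose, and its result can be passed as `azure_ad_token_provider` to `AzureOpenAIEmbeddings`.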

The `CosmosDb_ConnectionString` app setting in `api-app.bicep` is defined with an empty string (`value: ''`), meaning it must be set manually after deployment. This is a deployment gap — if the connection string is pasted in from the portal rather than fetched from Key Vault, it bypasses secrets management entirely.

**SEC-SECRET-001** — See Findings section.

---

## 3. Authentication and Authorization Assessment

### C# API (ChatAPI)

The API has **no authentication middleware** configured. `Program.cs` calls `app.UseAuthorization()` but never registers any authentication scheme (no `AddAuthentication`, no `AddJwtBearer`, no `AddMicrosoftIdentityWebApi`) and never calls `app.UseAuthentication()`. With no scheme registered and no endpoint carrying authorization metadata, `UseAuthorization()` has nothing to enforce and is effectively a no-op.

Neither `ChatController` nor `SessionController` carries any `[Authorize]` attribute. Both endpoints (`POST /chat` and `GET /session`) are publicly accessible to anyone on the internet, as the App Service has `publicNetworkAccess: Enabled`.

The `POST /chat` endpoint accepts a `sessionId` from the client without any validation or ownership check. A caller who knows (or can guess) any valid session ID can read and append to another user's chat history in CosmosDB.
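
The missing ownership check can be sketched in a few lines. This is a language-neutral illustration written in Python, not the API's C# code: `SessionRegistry`, `create`, and `require_owner` are hypothetical names, and the in-memory dictionary stands in for an owner field that would presumably be stored with the session document in CosmosDB.

```python
import secrets


class SessionRegistry:
    """Hypothetical in-memory map of session ID to the user who created it."""

    def __init__(self) -> None:
        self._owners: dict[str, str] = {}

    def create(self, user_id: str) -> str:
        # Unguessable ID, so knowing one session reveals nothing about others.
        session_id = secrets.token_urlsafe(32)
        self._owners[session_id] = user_id
        return session_id

    def require_owner(self, session_id: str, user_id: str) -> None:
        # Unknown and foreign sessions fail identically, so the response
        # does not reveal whether a given session ID exists.
        if self._owners.get(session_id) != user_id:
            raise PermissionError("session not found for this user")


registry = SessionRegistry()
alice_session = registry.create("alice")
registry.require_owner(alice_session, "alice")  # passes silently
```

The check only becomes meaningful once `user_id` comes from a validated token rather than from the request body, which is why it depends on the authentication findings being fixed first.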

**SEC-AUTH-001, SEC-AUTH-002, SEC-AUTH-003** — See Findings section.

### Python Function

The Azure Function app is declared with `func.AuthLevel.ANONYMOUS`. That setting only governs HTTP-triggered functions; because this function is blob-triggered and exposes no HTTP endpoint, the auth level has no effect here and is not a concern.

### React Frontend

The frontend has no authentication. There is no MSAL, no Azure AD login, no token acquisition, and no authorization header on API calls. This is consistent with the API having no auth — but both should be remediated together.

---

## 4. Input Validation and Injection Assessment

### Prompt Injection

The `ChatController` accepts a user-supplied `Input` string and passes it without any sanitization or filtering to `ChatService.GetResponseAsync()`, which inserts it directly into the Semantic Kernel `ChatHistory` as a user message. This user message is then submitted to GPT-4o via the Semantic Kernel completion service.

The system prompt in `Program.cs` attempts to limit the model's behavior through instructions, but prompt injection via crafted user input can bypass or override those instructions. There is no allowlist, blocklist, length limit, content filter check, or sanitization layer applied to the user's question before it reaches the AI model.

Similarly, in `AISearchDataPlugin`, the raw `question` string is used directly to generate embeddings and perform a search:

```csharp
var embedding = await _embedding.GenerateEmbeddingAsync(question);
var context = await _aisearchData.RetrieveDocumentationAsync(question, embedding);
```

There is no input length cap or content policy check.
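
A minimal sketch of the kind of pre-model gate that is missing, written in Python for illustration rather than as the API's C# code. `validate_user_input`, `InvalidInput`, and `MAX_INPUT_CHARS` are hypothetical names, and the 2000-character cap mirrors the example value in the recommendations. A gate like this narrows the attack surface but does not by itself stop prompt injection.

```python
import unicodedata

MAX_INPUT_CHARS = 2000  # example cap; tune per deployment


class InvalidInput(ValueError):
    """Raised when user input fails the pre-model checks."""


def validate_user_input(text: str, max_chars: int = MAX_INPUT_CHARS) -> str:
    if not isinstance(text, str):
        raise InvalidInput("input must be a string")
    # Normalize first so visually identical strings are measured identically.
    normalized = unicodedata.normalize("NFKC", text)
    # Drop control (Cc) and format (Cf) characters, keeping newline and tab.
    cleaned = "".join(
        ch for ch in normalized
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cc", "Cf")
    ).strip()
    if not cleaned:
        raise InvalidInput("input is empty")
    if len(cleaned) > max_chars:
        raise InvalidInput(f"input exceeds {max_chars} characters")
    return cleaned


question = validate_user_input("  hello\x00 world \u200b")
```

The same gate would apply before both the chat completion call and the embedding call, so that an oversized question is rejected once rather than billed twice.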

### NoSQL Injection

The CosmosDB queries in `ChatHistoryData` and `ProductData` use parameterized `QueryDefinition` with `.WithParameter()`, which is the correct safe pattern. No string concatenation was found in any Cosmos query, so no NoSQL injection risk was identified.

### Session ID Validation

The `sessionId` parameter received from the client is passed directly to `GetMessagesBySessionIdAsync()` and used as a parameterized query value. The Cosmos query is parameterized, so this is safe from injection. However, there is no validation that the requesting user owns the session — see SEC-AUTH-003.

**SEC-INJECT-001** — See Findings section.

---

## 5. CORS Configuration Assessment

`Program.cs` defines a single CORS policy:

```csharp
options.AddPolicy("AllowAll",
    policy => policy.AllowAnyOrigin()
                    .AllowAnyHeader()
                    .AllowAnyMethod());
```

This policy is applied globally via `app.UseCors("AllowAll")`. `AllowAnyOrigin()` permits requests from every domain. While `AllowAnyOrigin()` and `AllowCredentials()` are not combined here (which would be invalid), the policy is still unnecessarily permissive for a production service. Any website can make cross-origin requests to the API.

For a production deployment, the CORS policy should be restricted to the specific origin(s) that host the React SPA (the `web-*.azurewebsites.net` domain).
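
The restricted policy amounts to an exact-match allowlist. The sketch below shows that logic in Python for illustration only; it is not ASP.NET Core's implementation. `normalize_origin`, `is_origin_allowed`, and the hostname `web-example.azurewebsites.net` are hypothetical.

```python
from urllib.parse import urlsplit


def normalize_origin(origin: str) -> str:
    """Reduce an origin to scheme://host[:port], dropping default ports."""
    parts = urlsplit(origin)  # scheme and hostname come back lowercased
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not a valid web origin: {origin!r}")
    default_port = 443 if parts.scheme == "https" else 80
    port = parts.port  # raises ValueError on a malformed port
    if port is None or port == default_port:
        return f"{parts.scheme}://{parts.hostname}"
    return f"{parts.scheme}://{parts.hostname}:{port}"


# Hypothetical frontend hostname; replace with the deployed SPA origin.
ALLOWED_ORIGINS = frozenset({normalize_origin("https://web-example.azurewebsites.net")})


def is_origin_allowed(origin: str) -> bool:
    try:
        return normalize_origin(origin) in ALLOWED_ORIGINS
    except ValueError:
        return False


frontend_ok = is_origin_allowed("https://web-example.azurewebsites.net")
lookalike_ok = is_origin_allowed("https://web-evil.azurewebsites.net")
```

Matching whole origins rather than a `web-*` prefix matters on a shared domain such as `azurewebsites.net`, where anyone can register a hostname that fits the prefix.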

**SEC-CORS-001** — See Findings section.

---

## 6. Security Headers Assessment

No security middleware for HTTP response headers is configured in `Program.cs`. The following headers are absent:

| Header | Status |
|---|---|
| `X-Content-Type-Options: nosniff` | Not set |
| `X-Frame-Options: DENY` | Not set |
| `Content-Security-Policy` | Not set |
| `Referrer-Policy` | Not set |
| `Permissions-Policy` | Not set |

HTTPS redirection is configured (`app.UseHttpsRedirection()`), which is positive.

Swagger UI is unconditionally enabled (the environment check is commented out):

```csharp
//if (app.Environment.IsDevelopment())
//{
app.UseSwagger();
app.UseSwaggerUI();
//}
```

This means the Swagger UI and OpenAPI JSON endpoint are live in production, fully documenting the API surface to any caller.

**SEC-HEADER-001, SEC-HEADER-002** — See Findings section.

---

## 7. Infrastructure Security Assessment

### Storage Account (`blob-storage-account.bicep`)

The storage account is configured with `allowSharedKeyAccess: true` and `publicNetworkAccess: Enabled`. Three of the four blob containers (`load`, `completed`, `images`) have no explicit `publicAccess` setting, which means they inherit the storage account default (see SEC-INFRA-005). Because `allowBlobPublicAccess: false` is not set at the account level, nothing prevents an operator from later switching these containers to anonymous access. The `archive` container explicitly sets `publicAccess: 'None'`, which is correct, but the other three do not.

Additionally, the managed identity is granted three storage roles simultaneously: `Storage Blob Data Contributor`, `Storage Blob Data Owner`, and `Storage Account Contributor`. `Storage Blob Data Owner` (which includes full POSIX ACL control) subsumes `Storage Blob Data Contributor`, and `Storage Account Contributor` is a management-plane role that can read the account keys, which matters here because shared key access is enabled. Granting all three violates the principle of least privilege.

**SEC-INFRA-001, SEC-INFRA-002** — See Findings section.

### CosmosDB (`account.bicep` / `nosql/account.bicep`)

The core CosmosDB account module sets `disableLocalAuth: false` in the resource definition, and the NoSQL wrapper module hard-codes the corresponding parameter to the same value:

```bicep
disableKeyBasedAuth: false
```

Even though the application uses Managed Identity for all Cosmos access, key-based authentication remains enabled at the account level. This means anyone with the primary or secondary key (visible in the Azure Portal) can access the database, bypassing Managed Identity entirely.

**SEC-INFRA-003** — See Findings section.

### Key Vault

The `infra/core/security/key-vault.bicep` file exists but contains only one line (it is essentially empty). No Key Vault resource is actually defined or deployed by the infrastructure. Yet `loader-function.bicep` defines a `KeyVaultUri` app setting with an empty value, indicating an intent to use Key Vault that was never completed. The `CosmosDb_ConnectionString` in `api-app.bicep` is also set to an empty string, presumably intended to come from Key Vault.

**SEC-INFRA-004** — See Findings section.

### AI Search

The AI Search service is deployed with `publicNetworkAccess: 'enabled'` (the default). The service is accessed via Managed Identity with `disableLocalAuth: true` in the main deployment (positive), but the underlying Bicep module accepts this as a parameter that could be overridden.

### App Services

Both the API App Service and the React web App Service have `publicNetworkAccess: 'Enabled'` with no IP restrictions or private endpoints. This is expected for a public-facing web application. Because the SPA calls the API from the user's browser, network rules alone cannot limit the API to the frontend; authentication (SEC-AUTH-001) and a restricted CORS policy (SEC-CORS-001) are the controls that apply.

---

## 8. Dependency Security Assessment

### C# (NuGet)

| Package | Version | Assessment |
|---|---|---|
| `Azure.AI.OpenAI` | `2.1.0-beta.2` | Pre-release beta in production |
| `Azure.Search.Documents` | `11.6.0-beta.3` | Pre-release beta in production |
| `Microsoft.SemanticKernel` | `1.31.0` | Recent stable release, no known critical CVEs at audit date |
| `Azure.Identity` | `1.11.4` | Recent, no known critical CVEs |
| `Swashbuckle.AspNetCore` | `6.4.0` | Older; 6.9+ available; no critical CVEs but outdated |
| `Microsoft.Azure.Cosmos` | `3.39.1` | Recent stable release |

Using beta/pre-release packages (`-beta.x`) in production is a notable risk — these packages may have breaking changes or unresolved security issues without stable-channel patches.

### Python (`requirements.txt`)

| Package | Version | Assessment |
|---|---|---|
| `azure-identity` | `1.17.1` | Pinned, recent |
| `azure-core` | `1.29.0` | Pinned but not latest (1.32+ available); no critical CVEs known |
| `azure-search-documents` | `11.4.0` | Pinned but older; 11.6+ available |
| `langchain` | unpinned | Unpinned version — langchain has had multiple security advisories |
| `langchain-openai` | unpinned | Unpinned |
| `langchain-community` | unpinned | Unpinned |
| `azure-functions` | unpinned | Unpinned |
| `requests` | unpinned | Unpinned; `requests` has had CVEs in older versions |
| `beautifulsoup4` | unpinned | Unpinned |

Unpinned dependencies (`langchain`, `langchain-openai`, `langchain-community`, `requests`, etc.) in `requirements.txt` mean that any deployment could pull a version with a known vulnerability.
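
A check like this is easy to automate. The sketch below flags any requirement line that is not pinned with `==` to a concrete version; `find_unpinned` is a hypothetical helper and the sample text is illustrative, not the project's actual `requirements.txt`.

```python
import re

# name, optional [extras], "==", a concrete version (no wildcard),
# and an optional environment marker after ";".
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s;*]+\s*(;.*)?$"
)


def find_unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that do not pin an exact version."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


sample = """\
azure-identity==1.17.1
langchain
requests>=2.0  # range, not a pin
# a comment line
-r other.txt
"""
flagged = find_unpinned(sample)
```

Run in CI against the real file, a non-empty result could fail the build, which keeps new unpinned entries from slipping in after the initial cleanup.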

### React (`package.json`)

| Package | Version | Assessment |
|---|---|---|
| `react-scripts` | `5.0.1` | This is the Create React App toolchain — no longer actively maintained; contains known CVEs in transitive dependencies (webpack, etc.) |
| `html-react-parser` | `^5.1.10` | Recent; but its use is the more critical concern (see Frontend section) |
| `react` | `^18.3.1` | Current major; fine |

**SEC-DEP-001** — See Findings section.

---

## 9. Frontend Security Assessment

### Unsanitized AI HTML Output Rendered in Browser

`ChatLayout.js` uses `html-react-parser` to parse and render the AI model's response directly as React elements:

```javascript
import parse from 'html-react-parser'
// ...
<div>{parse(obj.message)}</div>
```

The AI model is instructed to return answers "wrapped in HTML tags for easy rendering" (per the system prompt in `Program.cs`). The React component parses this HTML and injects it into the DOM. If the AI model returns — or is manipulated via prompt injection to return — malicious HTML (e.g., `<script>`, `<img onerror=...>`, `<a href="javascript:...">`), it will be rendered in the user's browser.

`html-react-parser` does not sanitize HTML. It converts HTML strings to React elements, which React then renders. While React's JSX rendering mitigates some XSS vectors, `html-react-parser` with raw HTML from an untrusted source (an AI model that can be prompt-injected) is a meaningful XSS risk.
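
To make the missing step concrete, the sketch below shows an allowlist sanitizer built on Python's standard-library `HTMLParser`. It is an illustration of the technique, for example as a server-side pass before the response leaves the API, and not a replacement for a maintained client-side sanitizer. The tag and attribute allowlists and the name `sanitize_html` are assumptions chosen for the example.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

ALLOWED_TAGS = {"p", "br", "ul", "ol", "li", "strong", "em", "code", "pre", "a"}
VOID_TAGS = {"br"}
ALLOWED_ATTRS = {"a": {"href"}}
ALLOWED_SCHEMES = {"http", "https"}
DROP_WITH_CONTENT = {"script", "style"}


def _is_safe_href(value: str) -> bool:
    try:
        return urlsplit(value.strip()).scheme.lower() in ALLOWED_SCHEMES
    except ValueError:
        return False


class AllowlistSanitizer(HTMLParser):
    """Re-emits only allowlisted tags and attributes; escapes all text."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth += 1
            return
        if self._skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # drops event handlers such as onclick/onerror
            if name == "href" and not _is_safe_href(value):
                continue  # drops javascript: and other non-web schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.parts.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif not self._skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(escape(data))


def sanitize_html(untrusted: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(untrusted)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)


clean = sanitize_html(
    '<p>Hi<script>alert(1)</script><img src=x onerror=alert(1)>'
    '<a href="javascript:alert(1)">x</a>'
    '<a href="https://example.com" onclick="x()">ok</a></p>'
)
```

An allowlist is used instead of a blocklist because the set of dangerous tags, attributes, and URL schemes is open-ended, while the set of markup a chat answer needs is small.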

**SEC-FRONTEND-001** — See Findings section.

### API Host Configuration

The React app reads the API host from `process.env.REACT_APP_API_HOST`, which is set by the App Service configuration to `https://${appServiceNameAPI}.azurewebsites.net` (HTTPS). This is correct.

### No Authentication Tokens

The React app does not store any authentication tokens (no `localStorage`, no `sessionStorage` usage detected). The session ID from `GET /session` is stored only in React state (`useState`), which is appropriate.

### Console Logging of API Responses

`Agent.js` logs the raw API response to the browser console:

```javascript
console.log(res);
```

While not a direct security vulnerability, logging full API responses in production builds exposes response contents in the browser console and should be removed or gated (see SEC-FRONTEND-002).

---

## 10. Findings

| Severity | ID | Title | File |
|---|---|---|---|
| 🔴 Critical | SEC-AUTH-001 | No authentication configured on API | `src/ChatAPI/Program.cs` |
| 🔴 Critical | SEC-AUTH-002 | All API endpoints publicly accessible without authorization | `src/ChatAPI/Controllers/` |
| 🔴 Critical | SEC-INJECT-001 | Prompt injection: raw user input passed to AI model without sanitization | `src/ChatAPI/Services/ChatService.cs` |
| 🔴 Critical | SEC-FRONTEND-001 | Unsanitized AI HTML output rendered in browser via html-react-parser | `src/web/src/SupportAgent/ChatLayout.js` |
| 🔴 Critical | SEC-INFRA-003 | CosmosDB key-based authentication not disabled | `infra/core/database/cosmos-db/account.bicep` |
| 🟡 Notable | SEC-AUTH-003 | No session ownership validation — any caller can access any session | `src/ChatAPI/Controllers/ChatController.cs` |
| 🟡 Notable | SEC-CORS-001 | CORS policy allows all origins, all methods, all headers | `src/ChatAPI/Program.cs` |
| 🟡 Notable | SEC-HEADER-001 | Swagger UI unconditionally enabled in all environments including production | `src/ChatAPI/Program.cs` |
| 🟡 Notable | SEC-HEADER-002 | Missing security headers: X-Content-Type-Options, X-Frame-Options, CSP | `src/ChatAPI/Program.cs` |
| 🟡 Notable | SEC-INFRA-001 | Storage account has public network access enabled and shared key access allowed | `infra/core/storage/blob-storage-account.bicep` |
| 🟡 Notable | SEC-INFRA-004 | Key Vault not deployed; CosmosDB connection string set to empty in App Service config | `infra/core/security/key-vault.bicep`, `infra/app/api-app.bicep` |
| 🟡 Notable | SEC-DEP-001 | Unpinned Python dependencies and pre-release NuGet packages in production | `src/DocumentLoaderFunction/requirements.txt`, `src/ChatAPI/ChatAPI.csproj` |
| 🟡 Notable | SEC-SECRET-001 | Python function writes live bearer token to os.environ | `src/DocumentLoaderFunction/function_app.py` |
| 🟢 Minor | SEC-INFRA-002 | Overly permissive storage role assignments (three overlapping roles) | `infra/core/storage/blob-storage-account.bicep` |
| 🟢 Minor | SEC-INFRA-005 | Blob containers (load, completed, images) lack explicit publicAccess: None | `infra/core/storage/blob-storage-account.bicep` |
| 🟢 Minor | SEC-FRONTEND-002 | API response logged to browser console in production | `src/web/src/SupportAgent/Agent.js` |
| ℹ️ Info | SEC-INFO-001 | Managed Identity used consistently across all services — good baseline | Multiple infra files |
| ℹ️ Info | SEC-INFO-002 | CosmosDB queries use parameterized QueryDefinition — NoSQL injection risk mitigated | `src/ChatAPI/Data/ChatHistoryData.cs`, `ProductData.cs` |

---

## 11. Recommended Security Improvements (Prioritized by Risk)

| Finding | Recommended Action | Priority |
|---|---|---|
| SEC-AUTH-001 | Add `builder.Services.AddAuthentication().AddMicrosoftIdentityWebApi(...)` and `app.UseAuthentication()` to `Program.cs`. Integrate with Azure AD / Entra ID. | P0 — Must fix |
| SEC-AUTH-002 | Add `[Authorize]` to `ChatController` and `SessionController`. Define and enforce authorization policies. | P0 — Must fix |
| SEC-INJECT-001 | Validate and sanitize user input before inserting into AI prompts. Enforce a maximum input length (e.g., 2000 chars). Consider an Azure AI Content Safety check on user input before passing to the model. | P0 — Must fix |
| SEC-FRONTEND-001 | Replace `parse(obj.message)` with a sanitized render. Use `DOMPurify.sanitize()` on the HTML string before passing to `html-react-parser`, or avoid trusting AI-generated HTML entirely and use a markdown renderer instead. | P0 — Must fix |
| SEC-INFRA-003 | Set `disableLocalAuth: true` in `infra/core/database/cosmos-db/account.bicep` to disable CosmosDB key-based auth and enforce Managed Identity only. | P0 — Must fix |
| SEC-AUTH-003 | Bind `sessionId` to the authenticated user's identity. Validate in `ChatController` that the `sessionId` belongs to the authenticated caller before querying CosmosDB. | P1 — Fix soon |
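| SEC-CORS-001 | Replace `AllowAnyOrigin()` with `.WithOrigins(...)` listing the exact frontend URL (the deployed `web-*` App Service hostname). `WithOrigins` matches origins literally, so a pattern such as `https://web-*.azurewebsites.net` should not be passed as-is. | P1 — Fix soon |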
| SEC-HEADER-001 | Restore the environment check: only enable Swagger in Development (`if (app.Environment.IsDevelopment())`). | P1 — Fix soon |
| SEC-HEADER-002 | Add security headers middleware. Use `app.UseHsts()` in production, and add a response header middleware or NuGet package (e.g., `NetEscapades.AspNetCore.SecurityHeaders`) to set `X-Content-Type-Options`, `X-Frame-Options`, and a `Content-Security-Policy`. | P1 — Fix soon |
| SEC-INFRA-001 | Set `allowBlobPublicAccess: false` and consider restricting `publicNetworkAccess` to specific VNET ranges or using private endpoints for the storage account. | P1 — Fix soon |
| SEC-INFRA-004 | Complete the Key Vault deployment: add a Key Vault resource to `infra/core/security/key-vault.bicep`, enable soft-delete and purge protection, store the CosmosDB connection string as a Key Vault secret, and reference it via Key Vault reference syntax in the App Service configuration. | P1 — Fix soon |
| SEC-SECRET-001 | Refactor `function_app.py` to pass a token provider to the `AzureOpenAIEmbeddings` constructor through its `azure_ad_token_provider` parameter (a callable that returns a fresh token, built from the existing credential) rather than writing the raw token to `os.environ`. | P1 — Fix soon |
| SEC-DEP-001 | Pin all Python dependencies to specific tested versions. Replace pre-release NuGet packages (`Azure.AI.OpenAI 2.1.0-beta.2`, `Azure.Search.Documents 11.6.0-beta.3`) with their latest stable equivalents when available. | P2 — Fix when convenient |
| SEC-INFRA-002 | Keep a single data-plane role per consuming service: `Storage Blob Data Contributor` where read/write access is enough, `Storage Blob Data Owner` only where ACL management is actually required. Remove `Storage Account Contributor` unless the identity must manage the account itself. | P2 — Fix when convenient |
| SEC-INFRA-005 | Explicitly set `publicAccess: 'None'` on the `load`, `completed`, and `images` containers. | P2 — Fix when convenient |
| SEC-FRONTEND-002 | Remove `console.log(res)` from `Agent.js` production builds, or gate it with a `process.env.NODE_ENV === 'development'` check. | P3 — Fix when convenient |

---

## Footer

> This review is based entirely on static analysis of source files, configuration files, and infrastructure definitions. No runtime testing, penetration testing, dynamic analysis, or dependency vulnerability scanning (npm audit, dotnet list package --vulnerable) was performed. Actual exploitability of findings may vary based on deployment configuration not visible in source files.
>
> Generated by: Security Steward (`security-steward.md`) — run date 2026-03-21.
