Troubleshooting Microsoft Defender for Business: Step-by-Step Guide

Microsoft Defender for Business is a security solution designed for small and medium businesses to protect against cyber threats. When issues arise, a systematic troubleshooting approach helps identify root causes and resolve problems efficiently. This guide provides a step-by-step process to troubleshoot common Defender for Business issues, highlights where to find relevant logs and alerts, and suggests advanced techniques for complex situations. All steps are factual and based on Microsoft’s latest guidance as of 2025.

Table of Contents

  • Common Issues and Symptoms
  • Key Locations for Logs and Alerts
  • Step-by-Step Troubleshooting Process
    1. Identify the Issue and Gather Information
    2. Check the Microsoft 365 Defender Portal for Alerts
    3. Verify Device Status and Protection Settings
    4. Examine Device Logs (Event Viewer)
    5. Resolve Configuration or Policy Issues
    6. Verify Issue Resolution
    7. Escalate to Advanced Troubleshooting if Needed
  • Advanced Troubleshooting Techniques
  • Best Practices to Prevent Future Issues
  • Additional Resources and Support

Common Issues and Symptoms

These are some typical problems administrators encounter with Defender for Business:

  • Setup and Onboarding Failures: The initial setup or device onboarding process fails. An error like “Something went wrong, and we couldn’t complete your setup” may appear, indicating a configuration channel or integration issue (often with Intune)[1]. Devices that should be onboarded don’t show up in the portal.
  • Devices Showing As Unprotected: In the Microsoft Defender portal, you might see notifications that certain devices are not protected even though they were onboarded[1]. This often happens when real-time protection is turned off (for instance, if a non-Microsoft antivirus is running, it may disable Microsoft Defender’s real-time protection).
  • Mobile Device Onboarding Issues: Users cannot onboard their iOS or Android devices using the Microsoft Defender app. A symptom is that mobile enrollment doesn’t complete, possibly due to provisioning not finished on the backend[1]. For example, if the portal shows a message “Hang on! We’re preparing new spaces for your data…”, it means the Defender for Business service is still provisioning mobile support (which can take up to 24 hours) and devices cannot be added until provisioning is complete[1].
  • Defender App Errors on Mobile: The Microsoft Defender app on mobile devices may crash or show errors. Users report issues such as the app not updating its threat information or failing to connect. (Microsoft provides separate troubleshooting guides for the mobile Defender for Endpoint app on Android/iOS in such cases[1].)
  • Policy Conflicts: If you have multiple security management tools, you might see conflicting policies. For instance, an admin who was managing devices via Intune and then enabled Defender for Business’s simplified configuration could encounter conflicts where settings in Intune and Defender for Business overlap or contradict[1]. This can result in devices flipping between policy states or compliance errors.
  • Intune Integration Errors: During the setup process, an error indicating an integration issue between Defender for Business and Microsoft Intune might occur[1]. This often requires enabling certain settings (detailed in Step 5 below) to establish a proper configuration channel.
  • Onboarding or Reporting Delays: A device appears to onboard successfully but doesn’t show up in the portal or is missing from the device list even after some time. This could indicate a communication issue where the device is not reporting in. It might be caused by connectivity problems or by an issue with the Microsoft Defender for Endpoint service (sensor) on the device.
  • Performance or Scan Issues: Less common with Defender for Business, but possible. Devices might show high CPU usage or scans that get stuck, which could indicate an issue with Defender Antivirus on the endpoint that needs further diagnosis (this overlaps with Defender for Endpoint troubleshooting).

Understanding which of these scenarios matches your situation will guide where to look first. Next, we’ll cover where to find the logs and alerts that contain clues for diagnosis.


Key Locations for Logs and Alerts

Effective troubleshooting relies on checking both cloud portal alerts and on-device logs. Microsoft Defender for Business provides information in multiple places:

Microsoft 365 Defender Portal (security.microsoft.com): This is the cloud portal where Defender for Business is managed. The Incidents & alerts section is especially important. Here you can monitor all security incidents and alerts in one place[2]. For each alert, you can click to see details in a flyout pane – including the alert title, severity, affected assets (devices or users), and timestamps[2]. The portal often provides recommended actions or one-click remediation for certain alerts[2]. It’s the first place to check if you suspect Defender is detecting threats or if something triggered an alert that correlates with the issue.

Device Logs via Windows Event Viewer: On each Windows device protected by Defender for Business, Windows keeps local event logs for Defender components. Access these by opening Event Viewer (Start > eventvwr.msc). Key logs include:

  • Microsoft-Windows-SENSE/Operational – This log records events from the Defender for Endpoint sensor (“SENSE” is the internal code name for the sensor)[3]. If a device isn’t showing up in the portal or has onboarding issues, this log is crucial. It contains events for service start/stop, onboarding success/failure, and connectivity to the cloud. For example, Event ID 6 means the service isn’t onboarded (no onboarding info found), which indicates the device failed to onboard and needs the onboarding script rerun[3]. Event ID 3 means the service failed to start entirely[3], and Event ID 5 means it couldn’t connect to the cloud (network issue)[3]. We will discuss how to interpret and act on these later.
  • Windows Defender/Operational – This is the standard Windows Defender Antivirus log under Applications and Services Logs > Microsoft > Windows > Windows Defender > Operational. It logs malware detections and actions taken on the device[4]. For troubleshooting, this log is helpful if you suspect Defender’s real-time protection or scans are causing an issue or to confirm if a threat was detected on a device. You might see events like “Malware detected” (Event ID 1116) or “Malware action taken” (Event ID 1117) which correspond to threats found and actions (like quarantine) taken[4]. This can explain, for instance, if a file was blocked and that’s impacting a user’s work.
  • Other system logs: Standard Windows logs (System, Application) might also record errors (for example, if a service fails or crashes, or if there are network connectivity issues that could affect Defender).
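The same logs can also be read from a command line, which helps when you need to check several devices. The following is a minimal sketch, assuming the built-in Windows tool wevtutil is available and the script runs on the affected device. The function names are illustrative and not part of any Microsoft tooling.

```python
import subprocess
import sys

# Channel names as they appear in Event Viewer for the two Defender logs.
SENSE_LOG = "Microsoft-Windows-SENSE/Operational"
DEFENDER_LOG = "Microsoft-Windows-Windows Defender/Operational"


def build_query_command(log_name: str, count: int = 20) -> list[str]:
    """Build a wevtutil command that prints the newest `count` events as text."""
    return ["wevtutil", "qe", log_name, f"/c:{count}", "/rd:true", "/f:text"]


def read_recent_events(log_name: str, count: int = 20) -> str:
    """Run the query locally. Only works on Windows; may need an elevated prompt."""
    if sys.platform != "win32":
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(
        build_query_command(log_name, count),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On a Windows device, `print(read_recent_events(SENSE_LOG, 10))` would print the ten newest SENSE events, newest first.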

Alerts in Microsoft 365 Defender: Defender for Business surfaces alerts in the portal for various issues, not only malware. For example, if real-time protection is turned off on a device, the portal will flag that device as not fully protected[1]. If a device hasn’t reported in for a long time, it might show in the device inventory with a stale last-seen timestamp. Additionally, if an advanced attack is detected, multiple alerts will be correlated as an incident; an incident might be tagged with “Attack disruption” if Defender automatically contained devices to stop the spread[2] – such context can validate if an ongoing security issue is causing what you’re observing.

Intune or Endpoint Manager (if applicable): Since Defender for Business can integrate with Intune (Endpoint Manager) for device management and policy deployment, some issues (especially around onboarding and policy conflicts) may require checking Intune logs:

  • In Intune admin center, review the device’s Enrollment status and Device configuration profiles (for instance, if a security profile failed to apply, it could cause Defender settings to not take effect).
  • Intune’s Troubleshooting + support blade for a device can show error codes if a policy (like onboarding profile) failed.
  • If there’s a known integration issue (like the one mentioned earlier), ensure the Intune connection and settings are enabled as described in the next sections.

Advanced Hunting and Audit (for advanced users): If you have access to Microsoft 365 Defender’s advanced hunting (which might require an upgraded license beyond Defender for Business’s standard features), you could query tables (e.g., DeviceEvents, AlertInfo) for deeper investigation. Also, the Audit Logs in the Defender portal record configuration changes (useful to see if someone changed a policy right before issues started).

Now, with an understanding of where to get information, let’s proceed with a systematic troubleshooting process.


Step-by-Step Troubleshooting Process

The following steps outline a logical process to troubleshoot issues in Microsoft Defender for Business. Adjust the steps as needed based on the specific symptoms you are encountering.

Step 1: Identify the Issue and Gather Information

Before jumping into configuration changes, clearly define the problem. Understanding the nature of the issue will focus your investigation:

  • What are the symptoms? For example, “Device X is not appearing in the Defender portal”, “Users are getting no protection on their phones”, or “We see an alert that one device isn’t protected”, etc.
  • When did it start? Did it coincide with any changes (onboarding new devices, changing policies, installing another antivirus, etc.)?
  • Who or what is affected? A single device, multiple devices, all mobile devices, a specific user?
  • Any error messages? Note any message in the portal or on the device. For instance, an error code during setup, or the portal banner saying “some devices aren’t protected”[1]. These messages often hint at the cause.

Gathering this context will guide you on where to look first. For example, an issue with one device might mean checking that device’s status and logs, whereas a widespread issue might suggest a configuration problem affecting many devices.

Step 2: Check the Microsoft 365 Defender Portal for Alerts

Log in to the Microsoft 365 Defender portal (https://security.microsoft.com) with appropriate admin credentials. This centralized portal often surfaces the problem:

  1. Go to Incidents & alerts: In the left navigation pane, click “Incidents & alerts”, then select “Alerts” (or “Incidents” for grouped alerts)[2]. Look for any recent alerts that correspond to your issue. For example, if a device isn’t protected or hasn’t reported, there may be an alert about that device.
  2. Review alert details: If you see relevant alerts, click on one to open the details flyout. Check the alert title and description – these describe what triggered it (e.g. “Real-time protection disabled on Device123” or “Malware detected and quarantined”). Note the severity (Informational, Low, Medium, High) and the affected device or user[2]. The portal will list the device name and perhaps the user associated with it.
  3. Take recommended actions: The alert flyout often includes recommended actions or a direct link to “Open incident page” or “Take action”. For instance, for a malware alert, it may suggest running a scan or isolating the device. For a configuration alert (like real-time protection off), it might recommend turning it back on. Make note of these suggestions as they directly address the issue described[2].
  4. Check the device inventory: Still in the Defender portal, navigate to Devices (under Assets). Find the device in question. The device page can show its onboarding status, last seen time, OS, and any outstanding issues. If the device is missing entirely, that confirms an onboarding problem – skip to Step 4 to troubleshoot that.
  5. Inspect Incidents: If multiple alerts have been triggered around the same time or on the same device, the portal might have grouped them into an Incident (visible under the Incidents tab). Open the incident to see a timeline of what happened. This can give a broader context especially if a security threat is involved (e.g. an incident might show that malware was detected and then real-time protection was turned off, indicating the malware might have attempted to disable Defender).

Example: Suppose the portal shows an alert “Real-time protection was turned off on DeviceXYZ”. This is a clear indicator – the device is onboarded but not actively protecting in real-time[1]. The recommended action would likely be to turn real-time protection back on. Alternatively, if an alert says “New malware found on DeviceXYZ”, you’d know the issue is a threat detection, and the portal might guide you to remediate or confirm that malware was handled. In both cases, you’ve gathered an essential clue before even touching the device.

If you do not see any alert or indicator in the portal related to your problem, the issue might not be something Defender is reporting on (for example, if the problem is an onboarding failure, there may not be an alert – the device just isn’t present at all). In such cases, proceed to the next steps.
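When the alert list is long, it can help to narrow it to one device and read the most severe entries first. The sketch below assumes you have already copied or exported the alerts into simple records. The Alert class and its field names are hypothetical and do not reflect the portal’s export format.

```python
from dataclasses import dataclass

# Severity levels named in the alert details, ordered most to least severe.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2, "informational": 3}


@dataclass
class Alert:
    title: str
    severity: str
    device: str


def alerts_for_device(alerts: list[Alert], device: str) -> list[Alert]:
    """Return the alerts for one device, most severe first."""
    matching = [a for a in alerts if a.device.lower() == device.lower()]
    # Unknown severity labels sort after all known ones.
    return sorted(
        matching,
        key=lambda a: SEVERITY_ORDER.get(a.severity.lower(), len(SEVERITY_ORDER)),
    )
```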

Step 3: Verify Device Status and Protection Settings

Next, ensure that the devices in question are configured correctly and not in a state that would cause issues:

  1. Confirm onboarding completion: If a device doesn’t appear in the portal’s device list, ensure that the onboarding process was done on that device. Re-run the onboarding script or package on the device if needed. (Defender for Business devices are typically onboarded via the local script, Intune, Group Policy, etc. If this step wasn’t done or failed, the device won’t show up in the portal.)
  2. Check provisioning status for mobile: If the issue is with mobile devices (Android/iOS) not onboarding, verify that Defender for Business provisioning is complete. As mentioned, the portal (under Devices) might show a message “preparing new spaces for your data” if the service setup is still ongoing[1]. Provisioning can take up to 24 hours for a new tenant. If you see that message, the best course is to wait until it disappears (i.e., until provisioning finishes) before troubleshooting further. Once provisioning is done, the portal will prompt to onboard devices, and then users should be able to add their mobile devices normally[1].
  3. Verify real-time protection setting: On any Windows device showing “not protected” in the portal, log onto that device and open Windows Security > Virus & threat protection. Check if Real-time protection is on. If it’s off and cannot be turned on, check if another antivirus is installed. By design, onboarding a device running a third-party AV can cause Defender’s real-time protection to be automatically disabled to avoid conflict[1]. In Defender for Business, Microsoft expects Defender Antivirus to be active alongside the service for best protection (“better together” scenario)[1]. If a third-party AV is present, decide if you will remove it or live with Defender in passive mode (which reduces protection and triggers those alerts). Ideally, ensure Microsoft Defender Antivirus is enabled.
  4. Policy configuration review: If you suspect a policy conflict or misconfiguration, review the policies applied:
    • In the Microsoft 365 Defender portal, go to Endpoints > Settings > Rules & policies (or in Intune’s Endpoint security if that’s used). Ensure that you haven’t defined contradictory policies in multiple places. For example, if Intune had a policy disabling something but Defender for Business’s simplified setup has another setting, prefer one system. In a known scenario, an admin had Intune policies and then used the simplified Defender for Business policies concurrently, leading to conflicts[1]. The resolution was to delete or turn off the redundant policies in Intune and let Defender for Business policies take precedence (or vice versa) to eliminate conflicts[1].
    • Also verify tamper protection status – by default, tamper protection is on (preventing unauthorized changes to Defender settings). If someone turned it off for troubleshooting and forgot to re-enable, settings could be changed without notice.
  5. Intune onboarding profile (if applicable): If devices were onboarded via Intune (which should be the case if you connected Defender for Business with Intune), check the Endpoint security > Microsoft Defender for Endpoint section in Intune. Ensure there’s an onboarding profile and that those devices show as onboarded. If a device is stuck in a pending state, you may need to re-enroll or manually onboard.

By verifying these settings, you either fix simple oversights (like turning real-time protection back on) or gather evidence of a deeper issue (for example, confirming a device is properly onboarded, yet still not visible, implying a reporting issue, or confirming there’s a policy conflict that needs resolution in the next step).
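The real-time protection and tamper protection checks above can also be scripted. This is a sketch only, assuming the device has the Defender PowerShell module with the Get-MpComputerStatus cmdlet and that the properties named below are present in its output. The helper names are illustrative.

```python
import json
import subprocess

# PowerShell query for the handful of status properties relevant to this step.
PS_COMMAND = (
    "Get-MpComputerStatus | "
    "Select-Object AMRunningMode, AntivirusEnabled, "
    "RealTimeProtectionEnabled, IsTamperProtected | ConvertTo-Json"
)


def query_defender_status() -> dict:
    """Run the query on the local Windows device and parse the JSON result."""
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", PS_COMMAND],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def protection_findings(status: dict) -> list[str]:
    """Translate raw status values into the checks described in this step."""
    findings = []
    if not status.get("RealTimeProtectionEnabled", False):
        findings.append("Real-time protection is off")
    mode = str(status.get("AMRunningMode", ""))
    if "passive" in mode.lower():
        findings.append(f"Defender Antivirus is in passive mode ({mode})")
    if not status.get("IsTamperProtected", False):
        findings.append("Tamper protection is off")
    return findings
```

An empty list from `protection_findings(query_defender_status())` suggests the device-side settings are not the cause, so the investigation can move on to the logs.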

Step 4: Examine Device Logs (Event Viewer)

If the issue is not yet resolved by the above steps, or if you need more insight into why something is wrong, dive into the device’s event logs for Microsoft Defender. Perform these checks on an affected device (or a sample of affected devices if multiple):

  1. Open Event Viewer (Local logs): On the Windows device, press Win + R, type eventvwr.msc and hit Enter. Navigate to Applications and Services Logs > Microsoft > Windows and scroll through the sub-folders.
  2. Check “SENSE” Operational log: Locate Microsoft > Windows > SENSE > Operational and click it to open the Microsoft Defender for Endpoint service log[3]. Look for recent Error or Warning events in the list:
    • Event ID 3: “Microsoft Defender for Endpoint service failed to start.” This means the sensor service didn’t fully start on boot[3]. Check if the Sense service is running (in Services.msc). If not, an OS issue or missing prerequisites might be at fault.
    • Event ID 5: “Failed to connect to the server at (address).” This indicates the endpoint could not reach the Defender cloud service URLs[3]. This can be a network or proxy issue. Ensure the device has internet access and that security.microsoft.com and related endpoints are not blocked by a firewall or proxy.
    • Event ID 6: “Service isn’t onboarded and no onboarding parameters were found.” This tells us the device never got the onboarding info – effectively it’s not onboarded in the service[3]. Possibly the onboarding script never ran successfully. Solution: rerun onboarding and ensure it completes (the event will change to ID 11 on success).
    • Event ID 7: “Service failed to read onboarding parameters”[3] – similar to ID 6, means something went wrong reading the config. Redeploy the onboarding package.
    • Other SENSE events might point to connectivity, registry permission, or missing-feature problems. For example, Event ID 15 means the service could not start its command channel with the cloud URL, which is another connectivity symptom, while other events can indicate that the ELAM driver is disabled or a component is missing. Those cases are rare on modern systems, and the event description will usually suggest enabling a feature or installing a Windows update[5].
    Each event has a description. Compare the event’s description against Microsoft’s documentation for Defender for Endpoint event IDs to get specific guidance[3]. Many event descriptions (like the examples above) already hint at the resolution (e.g., check connectivity, redeploy scripts).
  3. Check “Windows Defender” Operational log: Next, open Microsoft > Windows > Windows Defender > Operational. Look for recent entries, especially around the time the issue occurred:
    • If the issue is related to threat detection or a failed update, you might see events in the 1000-2000 range (these correspond to malware detection events and update events).
    • For example, Event ID 1116 (MALWAREPROTECTION_STATE_MALWARE_DETECTED) means malware was detected, and ID 1117 means an action was taken on malware[4]. These confirm whether Defender actually caught something malicious, which might have triggered further issues.
    • You might also see events indicating that a user or admin turned settings off. Events in the 5000 range relate to protection state and configuration; for example, Event ID 5001 records that real-time protection was disabled, and Event ID 5007 records a configuration change.
    The Windows Defender log is more about security events than errors; if your problem is purely a configuration or onboarding issue, this log might not show anything unusual. But it’s useful to confirm if, say, Defender is working up to the point of detecting threats or if it’s completely silent (which could mean it’s not running at all on that device).
  4. Additional log locations: If troubleshooting a device connectivity or performance issue, also check the System log in Event Viewer for any relevant entries (e.g., Service Control Manager errors if the Defender service failed repeatedly). Also, the Security log might show Audit failures if, for example, Defender attempted an action.
  5. Analyze patterns: If multiple devices have issues, compare logs. Are they all failing to contact the service (Event ID 5)? That could point to a common network issue. Are they all showing not onboarded (ID 6/7)? Maybe the onboarding instruction wasn’t applied to that group of devices or a script was misconfigured.

By scrutinizing Event Viewer, you gather concrete evidence of what’s happening at the device level. For instance, you might confirm “Device A isn’t in the portal because it has been failing to reach the Defender service due to proxy errors – as Event ID 5 shows.” Or “Device B had an event indicating onboarding never completed (Event 6), explaining why it’s missing from portal – need to re-onboard.” This will directly inform the fix.
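The pattern analysis described above can be sketched as a small lookup over the SENSE event IDs discussed in this step. The event meanings and actions are taken from the list above; the function names and the input shape (device name mapped to the event IDs seen on it) are assumptions made for illustration.

```python
# Event ID -> (meaning, suggested action), as described in Step 4.
SENSE_EVENTS = {
    3: ("Service failed to start", "Check the Sense service and prerequisites"),
    5: ("Failed to connect to the server", "Check internet, firewall, and proxy"),
    6: ("Not onboarded, no onboarding parameters", "Rerun the onboarding script"),
    7: ("Failed to read onboarding parameters", "Redeploy the onboarding package"),
    11: ("Onboarding completed", "No action needed"),
}


def devices_by_event(observations: dict[str, list[int]]) -> dict[int, list[str]]:
    """Group device names by each known SENSE event ID seen on them."""
    by_event: dict[int, list[str]] = {}
    for device, event_ids in sorted(observations.items()):
        for event_id in sorted(set(event_ids)):
            if event_id in SENSE_EVENTS:
                by_event.setdefault(event_id, []).append(device)
    return by_event


def shared_problems(observations: dict[str, list[int]]) -> list[int]:
    """Return problem event IDs present on every device (a likely common cause)."""
    if not observations:
        return []
    common = set.intersection(*(set(ids) for ids in observations.values()))
    return sorted(e for e in common if e in SENSE_EVENTS and e != 11)
```

If `shared_problems` returns an ID such as 5 for every affected device, a shared network or proxy issue is more likely than a per-device fault.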

Step 5: Resolve Configuration or Policy Issues

Armed with the information from the portal (Step 2), settings review (Step 3), and device logs (Step 4), you can now take targeted actions to fix the issue.

Depending on what you found, apply the relevant resolution below:

  • If Real-Time Protection Was Off: Re-enable it. In the Defender portal, ensure that your Next-generation protection policy has Real-time protection set to On. If a third-party antivirus is present and you want Defender active, consider uninstalling the third-party AV or check if it’s possible to run them side by side. Microsoft recommends using Defender AV alongside Defender for Business for optimal protection[1]. Once real-time protection is on, the portal should update and the “not protected” alert will clear.
  • If Devices Weren’t Onboarded Successfully: Re-initiate the onboarding:
    • For devices managed by Intune, you can trigger a re-enrollment or use the onboarding package again via a script/live response.
    • If using local scripts, run the onboarding script as Administrator on the PC. After running, check Event Viewer again for Event ID 11 (“Onboarding completed”)[3].
    • For any devices still not appearing, consider running the Microsoft Defender for Endpoint Client Analyzer on those machines – it’s a diagnostic tool that can identify issues (discussed in Advanced section).
  • If Event Logs Show Connectivity Errors (ID 5, 15): Ensure the device has internet access to Defender endpoints. Make sure no firewall is blocking:
    • URLs such as *.security.microsoft.com and *windows.com that the Defender cloud service uses. Proxy settings might need to allow the Defender service through. See Microsoft’s documentation on Defender for Endpoint network connections for the required URLs.
    • After adjusting network settings, force the device to check in (you can reboot the device or restart the Sense service and watch Event Viewer to see if it connects successfully).
  • If Policy Conflicts are Detected: Decide on one policy source:
    • Option 1: Use Defender for Business’s simplified configuration exclusively. This means removing or disabling parallel Intune endpoint security policies that configure AV or Firewall or Device Security, to avoid overlap[1].
    • Option 2: Use Intune (Endpoint Manager) for all device security policies and avoid using the simplified settings in Defender for Business. In this case, go to the Defender portal settings and turn off the features you are managing elsewhere.
    • In practice, if you saw conflicts, a quick remedy is to delete duplicate policies. For example, if Intune had an Antivirus policy and Defender for Business also has one, pick one to keep. Microsoft’s guidance for a situation where an admin uses both was to delete existing Intune policies to resolve conflicts[1].
    • After aligning policies, give it some time for devices to update their policy and then check if the conflict alerts disappear.
  • If Integration with Intune Failed (Setup Error): Follow Microsoft’s recommended fix, which involves three steps[1]:
    1. In the Defender for Business portal, go to Settings > Endpoints > Advanced Features and ensure Microsoft Intune connection is toggled On[1].
    2. Still under Settings > Endpoints, find Configuration management > Enforcement scope. Make sure Windows devices are selected to be managed by Defender for Endpoint (Defender for Business)[1]. This allows Defender to actually enforce policies on Windows clients.
    3. In the Intune (Microsoft Endpoint Manager) portal, navigate to Endpoint security > Microsoft Defender for Endpoint. Enable the setting “Allow Microsoft Defender for Endpoint to enforce Endpoint Security Configurations” (set to On)[1]. This allows Intune to hand off certain security configuration enforcement to Defender for Business.
    These steps establish the necessary channels so that Defender for Business and Intune work in harmony. After doing this, retry the setup or onboarding that failed. The previous error message about the configuration channel should not recur.
  • If Onboarding Still Fails or Device Shows Errors: If after trying to onboard, the device still logs errors like Event 7 or 15 indicating issues, consider these:
    • Run the onboarding with local admin rights (ensure no permission issues).
    • Update the device’s Windows to latest patches (sometimes older Windows builds have known issues resolved in updates).
    • As a last resort, you can try an alternate onboarding method (e.g., if script fails, try via Group Policy or vice versa).
    • Microsoft also suggests if Security Management (the feature that allows Defender for Business to manage devices without full Intune enrollment) is causing trouble, you can temporarily manually onboard the device to the full Defender for Endpoint service using a local script as a workaround[1]. Then offboard and try again once conditions are corrected.
  • If a Threat Was Detected (Malware Incident): Ensure it’s fully remediated:
    • In the portal, check the Action center (under “Actions & submissions”) to see whether any remediation actions are pending (such as undoing a quarantine).
    • Run a full scan on the device through the portal or locally.
    • Once threats are removed, verify if any residual impact remains (e.g., sometimes malware can turn off services – ensure the Windows Security app shows all green).
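For the connectivity case above, a quick reachability test from the affected device can rule out a basic firewall block before you dig into proxy settings. This is a rough sketch, assuming outbound HTTPS on port 443. It only proves that a TCP connection opens, not that a proxy or TLS inspection lets the Defender traffic through. The function name and the injectable `connect` parameter are illustrative.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0,
              connect=socket.create_connection) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with connect((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_reach("security.microsoft.com")` checks the portal host named in this guide; take the full list of required hosts from Microsoft’s network connections documentation rather than guessing them.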

Perform the relevant fixes and monitor the outcome. Many changes (policy changes, enabling features) may take effect within minutes, but some might take an hour or more to propagate to all devices. You can speed up policy application by instructing devices to sync with Intune (if managed) or simply rebooting them.
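When resolving policy conflicts, it helps to list exactly which settings disagree before deleting anything. The sketch below assumes you have written down the relevant settings from each source as simple key-value pairs. The setting names in the usage example are hypothetical and do not correspond to real policy identifiers.

```python
def find_conflicts(intune: dict, defender: dict) -> dict:
    """Return settings defined in both sources whose values differ.

    The result maps each setting name to (intune_value, defender_value).
    """
    shared = sorted(intune.keys() & defender.keys())
    return {k: (intune[k], defender[k]) for k in shared if intune[k] != defender[k]}


# Hypothetical setting names, for illustration only.
intune_policy = {"real_time_protection": "off", "cloud_protection": "on"}
defender_policy = {"real_time_protection": "on", "firewall": "on"}
conflicts = find_conflicts(intune_policy, defender_policy)
```

Here `conflicts` contains only `real_time_protection`, because it is the one setting both sources define with different values. Settings defined in a single source are not conflicts.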

Step 6: Verify Issue Resolution

After applying fixes, confirm that the issue is resolved:

  • Check the portal again: Go back to the Microsoft 365 Defender portal’s Incidents & alerts and Devices pages.
    • If there was an alert (e.g., device not protected), it should now clear or show as Resolved. Many alerts auto-resolve once the condition is fixed (for instance, turning real-time protection on will clear that alert after the next device check-in).
    • If you removed conflicts or fixed onboarding, any incident or alert about those should disappear. The device should now appear in the Devices list if it was missing, and its status should be healthy (no warnings).
    • If a malware incident was being shown, ensure it’s marked Remediated or Mitigated. You might need to mark it as resolved if it doesn’t automatically.
  • Confirm on the device: For device-specific issues, physically check the device:
    • Open Windows Security and verify no warning icons are present.
    • In Event Viewer, see if new events are positive. For example, Event ID 11 in the SENSE log (“Onboarding completed”) confirms success[3]. In the Windows Defender log, Event ID 1117 shows that an action was taken on a detected threat.
    • If you restarted services or the system, ensure they stay running (the Sense service should be running and set to automatic).
  • Test functionality: Perform a quick test relevant to the issue:
    • If mobile devices couldn’t onboard, try onboarding one now that provisioning is fixed.
    • If real-time protection was off, intentionally place a test EICAR anti-malware file on the machine to see if Defender catches it (it should, if real-time protection is truly working).
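    • If devices were not reporting, prompt a machine to contact Microsoft’s cloud, for example by running MpCmdRun -SignatureUpdate, which also exercises outbound connectivity. Then check that the device’s last-seen time updates in the portal.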
    • These tests confirm that not only is the specific symptom gone, but the underlying protection is functioning as expected.

If everything looks good, congratulations – the immediate issue is resolved. Make sure to document what the cause was and how it was fixed, for future reference.
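The check that the Sense service stays running can be automated with the built-in `sc query` command. This is a minimal sketch that assumes English-language command output (the STATE label may be localized on other systems). The helper names are illustrative.

```python
import re
import subprocess
from typing import Optional


def parse_service_state(sc_output: str) -> Optional[str]:
    """Extract the state word (e.g. RUNNING, STOPPED) from `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def sense_service_state() -> Optional[str]:
    """Query the Sense service on the local Windows device."""
    result = subprocess.run(
        ["sc", "query", "sense"], capture_output=True, text=True
    )
    return parse_service_state(result.stdout)
```

A result other than `RUNNING`, or `None` when the service is not found, means the device-side fix did not hold and Step 4 should be repeated.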

Step 7: Escalate to Advanced Troubleshooting if Needed

If the problem persists despite the above steps, or if logs are pointing to something unclear, it may require advanced troubleshooting:

  • Multiple attempts failed? For example, if a device still won’t onboard after trying everything, or an alert keeps returning with no obvious cause, then it’s time to dig deeper.
  • Use the Microsoft Defender Client Analyzer: Microsoft provides a Client Analyzer tool for Defender for Endpoint that collects extensive logs and configurations. In a Defender for Business context, you can run this tool via a Live Response session. Live Response is a feature that lets you run commands on a remote device from the Defender portal (available if the device is onboarded). You can upload the Client Analyzer scripts and execute them to gather a diagnostic package[6]. This package can highlight misconfigurations or environmental issues. For Windows, the script MDELiveAnalyzer.ps1 (and related modules like MDELiveAnalyzerAV.ps1 for AV-specific logs) will produce a zip file with results[6]. Review its findings for any errors (or provide it to Microsoft support).
  • Enable Troubleshooting Mode (if performance issue): If the issue is performance-related (e.g., you suspect Defender’s antivirus is causing an application to crash or high CPU), Microsoft Defender for Endpoint has a Troubleshooting mode that can temporarily relax certain protections for testing. This is more applicable to Defender for Endpoint P2, but if accessible, enabling troubleshooting mode on a device allows you to see if the problem still occurs without certain protections, thereby identifying if Defender was the culprit. Remember to turn it off afterwards.
  • Consult Microsoft Documentation: Sometimes a specific error or event ID might be documented in Microsoft’s knowledge base. For instance, Microsoft has a page listing Defender Antivirus event IDs and common error codes – check those if you have a particular code.
  • Community and Support Forums: It can be useful to see if others have hit the same issue. The Microsoft Tech Community forums or sites like Reddit (e.g., r/DefenderATP) might have threads. (For example, missing incidents/alerts were discussed on forums and might simply be a UI issue or permission issue in some cases.)
  • Open a Support Case: When all else fails, engage Microsoft Support. Defender for Business is a paid service; you can open a ticket through your Microsoft 365 admin portal. Provide them with:
    • A description of the issue and steps you’ve taken.
    • Logs (Event Viewer exports, the Client Analyzer output).
    • Tenant ID and device details, if requested.
    Microsoft’s support can analyze backend data and guide you further. They may identify whether it’s a known bug or something environment-specific.

Escalating ensures that more complex or rare issues (like a service bug or an unusual compatibility issue) are handled by those with deeper insight or the ability to patch the service.
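When preparing a support case, it can help to collect everything into one archive. The following is a minimal Python sketch, not an official tool: the function name bundle_support_files and the example file names are hypothetical, and it assumes the Event Viewer exports and the Client Analyzer output already exist on disk.

```python
import zipfile
from pathlib import Path


def bundle_support_files(files, notes, archive_path):
    """Zip the given evidence files plus a notes.txt summary for a support case.

    Returns the list of entry names written to the archive.
    """
    added = []
    with zipfile.ZipFile(Path(archive_path), "w", zipfile.ZIP_DEFLATED) as bundle:
        # The issue description and the steps already tried go in first.
        bundle.writestr("notes.txt", notes)
        added.append("notes.txt")
        for item in map(Path, files):
            if not item.is_file():
                continue  # skip anything that was never exported
            bundle.write(item, arcname=item.name)
            added.append(item.name)
    return added
```

A call might look like bundle_support_files(["defender-events.evtx", "analyzer-output.zip"], "Device will not onboard; steps tried: ...", "case.zip"), where both input names are placeholders. Missing files are skipped rather than raising, so check the returned list before uploading.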


Advanced Troubleshooting Techniques

For administrators comfortable with deeper analysis, here are a few advanced techniques and tools to troubleshoot Defender for Business issues:

Advanced Hunting: This is a query-based hunting tool available in Microsoft 365 Defender. If your tenant has it, you can run Kusto-style (KQL) queries to search for events. For example, to find all devices that had real-time protection off, you could query a configuration table such as DeviceTvmSecureConfigurationAssessment for that signal, or search DeviceEvents for specific event types across machines. It’s powerful for finding hidden patterns (such as a certain update causing multiple devices to onboard late, or a specific error code appearing on many machines).
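As an illustration, a hunting query can also be submitted programmatically. The sketch below only builds the HTTP request; it assumes the Microsoft Graph runHuntingQuery endpoint is available for your license and that you already hold an access token with hunting permissions. The table name, column names, and the configuration ID scid-2012 (assumed here to represent real-time protection) should be verified against the schema shown in your own portal before use.

```python
import json
import urllib.request

# Assumption: Microsoft Graph advanced hunting endpoint; confirm availability for your tenant.
HUNTING_URL = "https://graph.microsoft.com/v1.0/security/runHuntingQuery"

# Assumption: table, columns, and configuration ID; check them in the portal's schema pane.
RTP_OFF_QUERY = """
DeviceTvmSecureConfigurationAssessment
| where ConfigurationId == "scid-2012"
| where IsApplicable == 1 and IsCompliant == 0
| project DeviceName, OSPlatform, Timestamp
""".strip()


def build_hunting_request(query, access_token):
    """Build (but do not send) the POST request that carries a KQL query."""
    body = json.dumps({"Query": query}).encode("utf-8")
    return urllib.request.Request(
        HUNTING_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def run_hunting_query(query, access_token):
    """Send the query and return the result rows (requires network access and a valid token)."""
    with urllib.request.urlopen(build_hunting_request(query, access_token)) as resp:
        return json.load(resp).get("results", [])
```

Keeping request construction separate from sending makes it possible to inspect the payload before anything leaves the machine.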

Audit Logs: Especially useful if the issue might be due to an admin change. The audit log will show events like policy changes, onboarding package generated, settings toggled, who did it and when. It helps answer “did anything change right before this issue?” For instance, if an admin offboarded devices by mistake, the audit log would show that.
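If you export audit records to CSV, a short script can answer the “did anything change right before this issue?” question. This is a sketch under assumptions: the column names CreationDate, UserIds, and Operations are taken to match your export, timestamps are assumed to begin with YYYY-MM-DDTHH:MM:SS and are compared as naive UTC values, and the function name changes_before is hypothetical.

```python
import csv
import io
from datetime import datetime, timedelta


def changes_before(csv_text, issue_time, window_hours=24):
    """Return (time, user, operation) tuples logged in the window before issue_time."""
    start = issue_time - timedelta(hours=window_hours)
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: keep only the first 19 characters, dropping fractions and a trailing Z.
        when = datetime.strptime(row["CreationDate"][:19], "%Y-%m-%dT%H:%M:%S")
        if start <= when <= issue_time:
            hits.append((when, row["UserIds"], row["Operations"]))
    return sorted(hits)
```

Passing the time the problem was first reported as issue_time and widening window_hours step by step keeps the result list short enough to read by eye.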

Integrations and Log Forwarding: Many enterprises use a SIEM for unified logging. While Defender for Business is a more streamlined product, its data can be integrated into solutions like Sentinel (with some licensing caveats)[7]. Even without Sentinel, you could use Windows Event Forwarding to send important Defender events to a central server. That way, you can spot if all devices are throwing error X in their logs. This is beyond immediate troubleshooting, but helps in ongoing monitoring and advanced analysis.

Deep Configuration Checks: Sometimes group policies or registry values can interfere. Ensure no Group Policy is disabling Microsoft Defender Antivirus (check the “Turn off Microsoft Defender Antivirus” policy, named “Turn off Windows Defender Antivirus” on older builds). Verify that the device’s time and region settings are correct (an odd one, but significant time skew can cause cloud communication issues).

Use Troubleshooting Mode: Microsoft introduced a troubleshooting mode for Defender which, when enabled on a device, disables certain protections for a short window so you can, for example, install software that was being blocked or see if performance improves. After testing, it auto-resets. This is advanced and should be used carefully, but it’s another tool in the toolbox.

Using these advanced techniques can provide deeper insights or confirm whether the issue lies within Defender for Business or outside of it (for example, a network device blocking traffic). Always ensure that after advanced troubleshooting, you return the system to a fully secured state (re-enable anything you turned off, etc.).


Best Practices to Prevent Future Issues

Prevention and proper management can reduce the likelihood of Defender for Business issues:

  • Keep Defender Components Updated: Microsoft Defender AV updates its engine and intelligence regularly (multiple times a day for threat definitions). Ensure your devices are getting these updates automatically (they usually do via Windows Update or Microsoft Update service). Also, keep the OS patched so that the Defender for Endpoint agent (built into Windows 10/11) is up-to-date. New updates often fix known bugs or improve stability.
  • Use a Single Source for Policy: Avoid mixing multiple security management platforms for the same settings. If you’re using Defender for Business’s built-in policies, try not to also set those via Intune or Group Policy. Conversely, if you require the advanced control of Intune, consider using Microsoft Defender for Endpoint Plan 1 or 2 with Intune instead of Defender for Business’s simplified model. Consistency prevents conflicts.
  • Monitor the Portal Regularly: Make it a routine to check the Defender portal’s dashboard or set up email notifications for high-severity alerts. Early detection of an issue (like devices being marked unhealthy or a series of failed updates) can let you address it before it becomes a larger problem.
  • Educate Users on Defender Apps: If your users install the Defender app on mobile, ensure they know how to keep it updated and what it should do. Sometimes user confusion (like ignoring the onboarding prompt or not granting the app permissions) can look like a “technical issue”. Provide a simple guide for them if needed.
  • Test Changes in a Pilot: If you plan to change configurations (e.g., enable a new attack surface reduction rule, or integrate with a new management tool), test with a small set of devices/users first. Make sure those pilot devices don’t report new issues before rolling out more broadly.
  • Use “Better Together” Features: Microsoft often touts “better together” benefits – for example, using Defender Antivirus with Defender for Business for coordinated protection[1]. Embrace these recommendations. Features like Automatic Attack Disruption will contain devices during a detected attack[2], but only if all parts of the stack are active. Understand what features are available in your SKU and use them; missing out on a feature could mean missing a warning sign that something’s wrong.
  • Maintain Proper Licensing: Defender for Business is targeted for up to 300 users. If your org grows or needs more advanced features, consider upgrading to Microsoft Defender for Endpoint plans. This ensures you’re not hitting any platform limits and you get features like advanced hunting, threat analytics, etc., which can actually make troubleshooting easier by providing more data.
  • Document and Share Knowledge: Keep an internal wiki or document for your IT team about past issues and fixes. For example, note down “In Aug 2025, devices had conflict because both Intune and Defender portal policies were applied – resolved by turning off Intune policy X.” This way, if something similar recurs or a new team member encounters it, the solution is readily available.

By following best practices, you reduce misconfigurations and are quicker to catch problems, making the overall experience with Microsoft Defender for Business smoother and more reliable.


Additional Resources and Support

For further information and help on Microsoft Defender for Business:

  • Official Microsoft Learn Documentation: The page “Microsoft Defender for Business troubleshooting” on Microsoft Learn covers many of the issues discussed here (setup failures, device protection, mobile onboarding, policy conflicts) with step-by-step guidance[1]. The “View and manage incidents in Defender for Business” page explains how to use the portal to handle alerts and incidents[2]. These should be your first reference for any new or unclear issues.
  • Microsoft Tech Community & Forums: The Defender for Business community forum is a great place to see if others have similar questions. Microsoft MVPs and engineers often post walkthroughs and answer questions. For example, blogs like Jeffrey Appel’s have detailed guides on Defender for Endpoint/Business features and troubleshooting (common deployment mistakes, troubleshooting modes, etc.)[8].
  • Support Tickets: As mentioned, don’t hesitate to use your support contract. Through the Microsoft 365 admin center, you can start a service request. Provide detailed info and severity (e.g., if a security feature is non-functional, treat it with high importance).
  • Training and Workshops: Microsoft occasionally offers workshops or webinars on their security products. These can provide deeper insight into using the product effectively (e.g., a session on “Managing alerts and incidents” or “Endpoint protection best practices”). Keep an eye on the Microsoft Security community for such opportunities.
  • Up-to-date Security Blog: Microsoft’s Security blog and announcements (for example, on the TechCommunity) can have news of new features or known issues. A recent blog might announce a new logging improvement or a known issue being fixed in the next update – which could be directly relevant to troubleshooting.

In summary, Microsoft Defender for Business is a powerful solution, and with the step-by-step approach above, you can systematically troubleshoot issues that come up. Starting from the portal’s alerts, verifying configurations, checking device logs, and then applying fixes will resolve most common problems. And for more complex cases, Microsoft’s support and documentation ecosystem is there to assist. By understanding where to find information (both in the product and in documentation), you’ll be well-equipped to keep your business devices secure and healthy.

References

[1] Microsoft Defender for Business troubleshooting

[2] View and manage incidents in Microsoft Defender for Business

[3] Review events and errors using Event Viewer

[4] windows 10 – How to find specifics of what Defender detected in real …

[5] Troubleshoot Microsoft Defender for Endpoint onboarding issues

[6] Collect support logs in Microsoft Defender for Endpoint using live …

[7] Microsoft 365 Defender for Business logs into Microsoft Sentinel

[8] Common mistakes during Microsoft Defender for Endpoint deployments

Troubleshooting Email Delivery Failures in Exchange Online (Internal to External)


When an internal user’s email to an external recipient fails to deliver, Exchange Online will usually return a Non-Delivery Report (NDR) (also called a bounce message) to the sender. This guide provides an easy step-by-step approach to identify common causes of such failures and resolve them. It includes troubleshooting steps for both users and administrators, as well as a reference of common NDR error codes and their meanings.

Common Causes of Email Delivery Failures

1. Incorrect Recipient Address: Typos or outdated email addresses are a frequent cause.

2. Mailbox/Server Issues: The recipient’s mailbox might be full, or their mail server might be temporarily unreachable.

3. Policy or Security Blocks: Messages can be rejected due to sending limits, spam protection, or permission settings (e.g. not authorized to send to a group).

Common Reasons for Exchange Online Email Delivery Failures

    • Incorrect or Non-Existent Email Address: A simple typo or an address that doesn’t exist will cause a bounce. Exchange Online will report a bad destination mailbox address error if the address is incorrect. Always double-check that the recipient’s email is spelled correctly and is up-to-date.
    • Recipient’s Mailbox is Unavailable: If the external recipient’s mailbox is full, disabled, or non-operational, the message might not be delivered. A full mailbox or temporarily offline server causes a soft bounce, meaning the delivery failed temporarily. In such cases, you might receive an NDR indicating the mailbox can’t accept the message (e.g., mailbox quota exceeded).
    • External Server or DNS Issues: Sometimes the recipient’s email server isn’t reachable or their domain’s DNS records are misconfigured. Exchange Online keeps retrying for a period (typically 24-48 hours) and, if the destination never responds, gives up with an NDR such as “Message expired”. This often points to an issue on the receiving side (server down, incorrect MX records, etc.).
    • Sending Limits or Security Policies Triggered: Office 365 has sending limits and security measures. For example, if an account sends an unusually high volume of emails, it might be temporarily blocked for suspected spam (to protect the service). Also, if your organization or the recipient’s organization has policies (transport rules) restricting who can send to certain addresses (like distribution lists that only accept internal emails), your message can be rejected with an “authorized sender” error.
    • Spam or Filter Rejection: The email could be blocked by spam filters on either side. Exchange Online’s outbound filter might block content deemed spam or malicious, or the recipient’s email system might reject the message due to sender reputation, SPF/DKIM failures, or content. For example, an NDR with error code 5.7.23 indicates the recipient’s server rejected the mail because of an SPF check failure (your organization’s SPF record might be misconfigured). Similarly, the recipient’s server might block your organization’s email domain or IP if it’s on a blocklist.
    • Attachment Size or Type Issues: Sending very large attachments can lead to a bounce if the message exceeds size limits on the recipient’s end. Many email providers reject emails over a certain size. In such cases, you’d see an NDR indicating the message is too large. (For instance, a “552 5.3.4 message size limit exceeded” error). Likewise, certain attachment types might be blocked by security policies.

Understanding the reason behind the failure is key to resolution. The NDR received usually contains a status code and a brief explanation. Next, we’ll cover what steps an email sender (user) can take, followed by administrator-level diagnostics and fixes.
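Because the status code is the most useful part of an NDR, it can be worth pulling it out automatically when users forward bounce messages. The following Python sketch extracts enhanced status codes from pasted NDR text. The function name and the sample wording are illustrative assumptions, and the pattern is deliberately conservative so that fragments of IP addresses are not mistaken for codes.

```python
import re

# Enhanced status codes have the form class.subject.detail, where class is 2, 4, or 5.
# The lookarounds reject matches that are embedded in longer dotted numbers such as IP addresses.
STATUS_CODE = re.compile(r"(?<![\d.])[245]\.\d{1,3}\.\d{1,3}(?!\.?\d)")


def extract_status_codes(ndr_text):
    """Return the distinct enhanced status codes in the order they first appear."""
    seen = []
    for match in STATUS_CODE.finditer(ndr_text):
        code = match.group(0)
        if code not in seen:
            seen.append(code)
    return seen
```

For example, calling extract_status_codes on the text of the “Diagnostic information for administrators” section returns a short list that can be looked up in the code reference later in this guide.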


Step-by-Step Troubleshooting for Users

    • Step 1: Review the NDR (Bounce Message)

      When you receive a bounce email, read the User Information section. It often states what went wrong in plain language. For example, it might say “The email address you entered couldn’t be found” or “Message size exceeds limit.” Note any error codes (like 5.1.1 or 5.7.1) mentioned.

    • Step 2: Verify the Recipient’s Email Address

      One of the first things to check is the recipient’s address. Make sure there are no typos and that the address is current. An NDR with code 5.1.1 or 5.1.10 usually means the address was not recognized by the destination server. If the address is incorrect, fix it and try sending again.

    • Step 3: Check for Attachment or Size Issues

      If your email had a large attachment or many recipients, consider the possibility that it was rejected due to size or distribution. Try sending a simpler email (e.g., just text, no attachments) to the same recipient. If that goes through, the original message may have been too large or triggered a limit. For large files, use a cloud sharing link instead of an attachment.

    • Step 4: Read the NDR for Guidance

      NDR messages often include a “How to fix it” section with suggestions. For example, if the error was “recipient’s mailbox full,” the suggestion might be to wait until the recipient frees up space. If it says you’re not allowed to send to the recipient, it could be a policy issue (the recipient’s system rejects outside emails) – in that case, you may need to contact the recipient by other means to let them know, or have your administrator reach out to theirs.

    • Step 5: Try Sending Again or Later

      For transient problems (like a busy server or DNS issue), you might receive a delayed delivery notice first. If the NDR indicates a timeout or “message expired” (4.4.7), it suggests the recipient’s server couldn’t be reached in time. You can simply wait and try to resend later. Temporary glitches often get resolved, allowing a future attempt to succeed.

    • Step 6: Contact Your Administrator if It Persists

      If you’ve verified the address and retried, but the email still bounces (or the NDR suggests something you can’t fix, like “Access denied, bad outbound sender”), it’s time to involve your mail administrator or IT support. Provide them with the exact error message and code from the NDR – this information is crucial for deeper troubleshooting.

Tips for Users:

    • Use Outlook on the Web (OWA) for comparison: If you normally send email via Outlook desktop and suspect a client issue, try sending the email through Outlook on the Web. This helps rule out local configuration problems. (If it works on OWA, your Outlook app might need troubleshooting.)
    • Check Sent Items and Drafts: Ensure the message actually left your outbox. If it’s sitting in Drafts or Outbox, it may not have been sent at all (due to client-side issues). An NDR confirms the message did leave your mailbox but bounced back.
    • Look at NDR Details: In the bounce email, there is often a section “Diagnostic information for administrators” with technical details. While this is intended for IT staff, you can sometimes glean info like which server rejected the email and why. For instance, it may show the external server’s response like “550 5.7.1 SPF check failed” or “550 5.2.2 Mailbox full”. Don’t worry if it’s too technical – pass it to your admin.
    • Spam Content Check: If your email was bounced due to content (though rarely is it explicitly stated), consider if your message might have looked like spam (certain phrases or links). Adjusting the wording or removing suspicious attachments and trying again could help. (Your admin can confirm if your account was blocked for sending spam, which can happen if a mailbox is compromised.)

By following the above steps, many user-side issues can be resolved (especially address errors or message content issues). If not, the administrator will need to investigate further using admin tools.


Step-by-Step Troubleshooting for Administrators

Check Microsoft 365 Service Health: Before deep diving, ensure there isn’t a broad email service issue. Go to the Microsoft 365 Admin Center and check Service Health for Exchange Online. If there’s a known service degradation or outage affecting mail flow, Microsoft would be working on it, and that could explain external delivery issues. In such cases, advise users that service is degraded and monitor the health status.

    1. Use the Exchange Online Troubleshooter: Microsoft 365 provides an automated Email Delivery Troubleshooter for admins. In the Microsoft 365 Admin Center, navigate to the Troubleshooting or Support section and look for “Troubleshoot Email Delivery”. Enter the sender’s and recipient’s email addresses and run the tests. This diagnostic can catch common problems and misconfigurations and suggest fixes automatically.
    2. Run a Message Trace: The message trace tool is one of the most powerful ways to investigate mail flow. In the Exchange Admin Center (under Mail flow > Message trace), run a trace for the specific message or sender/recipient around the time of the issue. Look for the problematic message in the results:
      – If the trace shows the message was “Delivered” to the external party, then technically Exchange Online handed it off successfully. A delivered status means the issue might be on the recipient’s side (perhaps delivered to their spam folder or dropped by their server).
      – If the trace shows “Failed” or “Pending/Deferred”, examine the details. By selecting the message, you can see an explanation of what happened and a suggested “How to fix it” in many cases. The trace detail will include the SMTP status code and error text that the system encountered.
      – If no trace result is found, ensure you search the correct timeframe and that the email was sent as reported. (Trace by default covers the last 48 hours, but you can extend the range or run an extended trace for up to 90 days of history, though older traces come as a downloadable CSV.)
    3. Interpret the Error and NDR Code: Using the information from the message trace or the NDR (which the user hopefully provided), identify the error code and message. Refer to the Common NDR Error Codes section in this guide for quick insight. For a deep dive, Microsoft’s documentation lists many specific SMTP codes and their causes in Exchange Online. For example:
      – Bad address (5.1.1): Likely user error – verify if the address exists.
      – Relay or DNS failure (5.4.1, 4.4.7): Could be an external domain issue – you might need to check DNS or contact the recipient’s admin.
      – Spam-related or blocked (5.1.8, 5.7.50x): The sending account might be compromised or was sending bulk mail. If so, Microsoft may have temporarily blocked the account from external sending. You should scan the user’s system for malware, reset their password (in case of compromise), and then use the Exchange admin center or Microsoft 365 security portal to remove any sending block on the account. Microsoft might require you to contact support to re-enable a banned sender.
      – Not authorized (5.7.1, 5.7.133-134): This indicates the recipient’s side is rejecting the mail due to policy (maybe the recipient is a group that only accepts internal emails). In such cases, the solution lies with the recipient’s email administrator to allow external senders. As the sending admin, you may need to inform your user that the recipient must adjust their settings or provide an alternate contact method.
      – Use Microsoft’s NDR diagnostic tool if needed: In the Microsoft 365 Admin Center, there’s a feature to input the NDR code for more info. It can give tailored guidance on that specific error (for instance, it might direct you to a knowledge article like “Fix error code 5.4.1” with detailed steps).
    4. Verify Your Organization’s Mail Settings: If many users experience external delivery issues, check if there’s any configuration on your side:
      – Outbound Connectors: In Exchange Online, no connector is needed for general external sending (it uses Microsoft’s default route). However, if you have a Hybrid setup or use a third-party email gateway, an improperly configured Send Connector or partner connector could cause external delivery to fail. Validate connectors using the built-in tool or PowerShell. A misconfigured connector can result in “Relay Access Denied” errors or mail loops.
      – Transport Rules: Review your mail flow rules to ensure none are unintentionally blocking or redirecting external emails. For instance, a rule that restricts external forwarding or adds headers shouldn’t stop delivery outright (unless misconfigured).
      – DNS Records: Confirm that your organization’s DNS settings (MX, SPF, DKIM) are correct. While these primarily affect inbound mail and recipient-side processing, an incorrect SPF record can lead to external servers rejecting your messages (SPF hard fail). Make sure SPF includes all your sending IPs (Microsoft 365 and any other mail sources). An up-to-date SPF/DKIM/DMARC setup improves your chances of delivery and prevents rejections due to authentication failures.
    5. Check Sender’s Account Status: If the NDR or trace suggests the sender was blocked (for example, 5.1.8 Access denied, bad outbound sender or any 5.7.50x spam errors), go to the Microsoft Defender portal (the Restricted entities page, which took over this task from the former Security & Compliance Center) and check for alerts about that mailbox. Microsoft 365 might have flagged the account for sending outbound spam. After ensuring the account is secure, remove the user from the restricted (blocked senders) list if present. Also verify the user hasn’t hit any legitimate sending limits (e.g., trial tenants have low external recipient limits).
    6. Test and Follow Up: After any fixes (correcting addresses, adjusting configurations, unblocking accounts, etc.), have the user resend the email. Monitor the message trace again or ask the user to confirm if the email goes through. If the problem persists with a specific external domain despite everything on your side being normal, consider reaching out to the recipient’s mail administrator – their server may be rejecting your mails (the reason should be in the NDR). You can also attempt to send a test message from a different internal account to the same recipient to see if it’s a sender-specific issue or affects all senders in your org.
    7. Utilize Support Resources if Needed: If you’ve exhausted your troubleshooting and can’t identify the cause, you may open a support case with Microsoft. Provide them with message trace results and NDR details. Microsoft can help if it’s an issue on the Exchange Online side or give insight if your domain/IP is on any of their internal block lists beyond your control.
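When a trace returns many rows, a quick summary helps separate a single bad recipient from a domain-wide or tenant-wide problem. The sketch below summarizes a message trace exported to CSV. The column names Status and RecipientAddress are assumptions (they are passed as parameters so you can match your own export), and the function name is hypothetical.

```python
import csv
import io
from collections import Counter, defaultdict


def summarize_trace(csv_text, status_col="Status", recipient_col="RecipientAddress"):
    """Count delivery statuses overall and failed messages per recipient domain."""
    statuses = Counter()
    failed_by_domain = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        status = row[status_col].strip()
        statuses[status] += 1
        if status.lower() == "failed":
            # Group failures by the part after the last "@", ignoring case.
            domain = row[recipient_col].rsplit("@", 1)[-1].lower()
            failed_by_domain[domain] += 1
    return dict(statuses), dict(failed_by_domain)
```

If all failures cluster on one external domain, the recipient’s server is the likely place to look; failures spread across many domains point back to your own configuration or the sender’s account.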

Common NDR Error Codes and What They Mean

When an email bounces, the NDR will include an SMTP status code (also known as an enhanced delivery status code). Below is a list of some common NDR codes in Exchange Online and their typical meaning:

NDR Code | Description | Meaning / Possible Cause
550 5.1.1 | Bad destination mailbox address | The recipient’s email address is invalid or not found. Often caused by typos or an address that no longer exists on the destination server. The sender should verify the address and try again.
550 5.1.10 | Recipient not found | Similar to 5.1.1: the specified recipient’s address (particularly the domain) doesn’t exist in the recipient’s system. This can happen if the email was correct before but the external account was removed or changed. Double-check the address spelling and existence.
550 5.1.8 | Access denied, bad outbound sender | Exchange Online blocked the sender’s account from sending externally. This typically happens if the account was detected sending spam (possibly due to a compromised account). Admin intervention is required to secure and unblock the account.
550 5.2.2 | Submission quota exceeded | The sender has exceeded sending limits. Office 365 throttles users who send an unusually large number of messages or recipients in a short time. This is often a sign of a compromised account or an automated sending gone awry. The user should reduce sending volume and an admin may need to confirm the account’s security.
450 4.4.7 | Message expired (Deferred) | The message stayed in queue too long and timed out without reaching the recipient’s server. This is usually due to issues on the receiving side (server down, network issues, or misconfigured DNS). The sender can retry later; the admin should check that the target domain is reachable (DNS MX record, etc.).
550 5.4.1 | Relay access denied / Domain not found | The sending server wasn’t allowed to relay the message, or the recipient domain isn’t accepting mail. In Office 365, this can happen in hybrid setups or if the recipient’s domain has no valid mail exchanger. It may indicate a configuration issue either in connectors or on the recipient’s side (e.g., an MX record problem).
550 5.7.1 | Delivery not authorized, message refused | General unauthorized: The sender is not allowed to send to the recipient. Common causes: the recipient might be a distribution list or address restricted to internal senders, or a transport rule is blocking the message. For example, if you send to an external mailing list that only accepts members, you’ll get this error. Only the recipient’s admin can change this, or the sender must obtain permission.
550 5.7.1 (variant) | Unable to relay | Relay attempt failed: This occurs when a server tries to forward a message to another server and is not permitted. In a pure Exchange Online scenario, end-users shouldn’t normally see this unless an application or device is misconfigured. In hybrid scenarios, it can mean the on-premises server is not allowed to route outbound via Office 365 without authentication.
530 5.7.57 | Client not authenticated | The sending client/server did not authenticate where expected. This often appears when using SMTP submission (smtp.office365.com) from a device or app that didn’t properly authenticate. For user-sent mail via Exchange Online, this should not occur unless a connector is set incorrectly. The solution is to configure authentication or use the proper SMTP settings.
550 5.7.23 | SPF validation failed | The recipient’s email system rejected the message because it failed the SPF check. In other words, the sender’s domain isn’t authorized in DNS to send mail from the originating server. The admin should verify the SPF record for the sending domain includes all legitimate sending services and IPs.
550 5.7.501 (or 502/503) | Access denied, spam abuse – banned sender | Office 365 has banned the sender due to suspected spam. The account was likely sending out bulk or malicious emails. An admin needs to confirm the account is secure (change password, scan for malware) and then contact Microsoft support to re-enable sending.
550 5.7.506 | Access Denied, Bad HELO | The sending server introduced itself with an invalid HELO (typically by identifying as the recipient’s server). This is often seen as a spam characteristic. If your organization runs its own SMTP server or device, ensure its HELO/EHLO is properly configured to use its own domain name.
550 5.7.507 | Access denied, rejected by recipient (IP blocked) | The recipient’s organization blocked the sending IP address. This means your mail might be on a blocklist or the recipient explicitly blocked your domain/IP. The sender or admin would need to contact the recipient to get unblocked or request removal from blocklists.
552 5.3.4 | Message size limit exceeded | The email was too large for one of the mail systems. This error is often returned by the recipient’s server if the message size (including attachments) is over their limit. The solution is to reduce the size (compress files or use cloud sharing) and resend.

 

Note: The first digit of the status code indicates the type of failure. 4.x.x codes (e.g., 4.4.7) are temporary failures (the service will usually keep trying for some time), whereas 5.x.x codes (e.g., 5.1.1, 5.7.1) are permanent failures that require changes before reattempting. The examples above are some of the most encountered codes for internal-to-external mail issues. For a full list, see Microsoft’s documentation or use the admin center’s NDR diagnostic tool.
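The first-digit rule in the note above is simple enough to encode. The Python sketch below classifies a code as temporary or permanent and attaches a description for a handful of codes copied from the table in this section. The function name describe_status is hypothetical and the lookup is intentionally incomplete.

```python
# A few entries copied from the table in this section; extend as needed.
KNOWN_CODES = {
    "5.1.1": "Bad destination mailbox address",
    "5.1.8": "Access denied, bad outbound sender",
    "4.4.7": "Message expired",
    "5.7.23": "SPF validation failed",
    "5.3.4": "Message size limit exceeded",
}

# The first digit of an enhanced status code gives the failure class.
FAILURE_KIND = {"2": "success", "4": "temporary", "5": "permanent"}


def describe_status(code):
    """Return (kind, description) for an enhanced status code such as '4.4.7'."""
    parts = code.strip().split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not an enhanced status code: {code!r}")
    if parts[0] not in FAILURE_KIND:
        raise ValueError(f"unexpected class digit in {code!r}")
    return FAILURE_KIND[parts[0]], KNOWN_CODES.get(".".join(parts), "unknown code")
```

A temporary result suggests waiting or resending later, while a permanent result means something has to change (the address, the message, or a configuration) before another attempt is worthwhile.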


Tools and Best Practices for Preventing Delivery Issues

Maintaining smooth email delivery in Exchange Online involves proactive monitoring and configuration. Both users and admins can take preventive steps:

    • Keep Address Books Updated: Users should update contacts when people change addresses. Auto-complete (Outlook cache) can retain outdated addresses; removing old entries avoids misdirected emails.
    • Monitor Sending Limits: Educate users about sending limits (for example, Exchange Online caps how many recipients an account can send to per day). A sudden need to email thousands of people can trigger throttling. Use distribution lists or third-party mailing services for bulk email to avoid hitting these limits.
    • Enable Authentication Protocols: Admins should ensure SPF, DKIM, and DMARC are properly set for the domain. These help recipient servers trust your emails and reduce bounces due to authentication failures. An SPF misconfiguration can lead to many bounces (5.7.23 errors) until fixed.
    • Regularly Check Blocked Senders: Keep an eye on restricted users (accounts automatically blocked for sending spam); Microsoft 365 lists these in the security portal. If an account is compromised, follow your procedure to secure it and then remove the block. This prevents a situation where a user is unaware their account was blocked (they’d get 5.1.8 NDRs until unblocked).
    • Use Message Encryption or Alternatives for Large Files: Instead of sending very large attachments, users can use OneDrive or SharePoint links. This avoids bouncing on size grounds and is more reliable. Also, if sending sensitive content, using Office 365 Message Encryption or a secure link can sometimes avoid content-based rejections by external filters.
    • Test DNS Changes: If you change your DNS records (like MX or SPF), test email flow. Admins can use tools like the Microsoft Remote Connectivity Analyzer to send test emails or verify DNS and mail flow between your org and the outside world. This can catch issues (e.g., missing MX or incorrect SPF) before they impact users.
    • Stay Informed on Service Status: Admins should subscribe to Office 365 Service Health alerts. In the Admin Center, the Service Health dashboard provides up-to-date info on any email service problems. Microsoft also posts alerts in the Message Center for configuration changes or known issues that could affect mail flow. Being aware early can save time troubleshooting something that is a broader cloud issue.
    • Educate Users on NDRs: Make sure end-users know that when they get a bounce message, they should read it and share it with IT if needed. NDRs are helpful – they often contain the reason for failure and sometimes even how to resolve it. Users should not ignore these or just repeatedly resend without addressing the error.
    • Maintain Good Sending Reputation: Avoid practices that can get your domain flagged as spam (like users sending phishing or too much marketing email from their regular accounts). If your organization needs to send bulk emails (newsletters, etc.), consider using dedicated services or distinct IP pools. A good reputation means external servers won’t block you as often, resulting in fewer bounces (fewer “550 5.7.508 rejected by recipient” situations).
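
As a rough illustration of the SPF point above, the following Python sketch runs a few sanity checks on the text of an SPF record before it is published. It is a minimal sketch, not a full validator: the function name check_spf is hypothetical, it works on a string you paste in rather than querying DNS, and it only checks the version tag, the closing "all" mechanism, and the 10-lookup limit from RFC 7208.

```python
# Terms that trigger a DNS lookup when an SPF record is evaluated.
# RFC 7208 allows at most 10 of these per evaluation.
LOOKUP_TERMS = ("include", "a", "mx", "ptr", "exists", "redirect")


def check_spf(record):
    """Return a list of problems found in the text of an SPF TXT record."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return ["record does not start with 'v=spf1'"]

    problems = []
    lookups = 0
    for term in terms[1:]:
        # Drop an optional qualifier (+, -, ~, ?) and any argument or CIDR suffix.
        name = term.lstrip("+-~?").replace("=", ":").split(":")[0].split("/")[0]
        if name.lower() in LOOKUP_TERMS:
            lookups += 1
    if lookups > 10:
        problems.append(f"{lookups} DNS-lookup terms found; the limit is 10")

    # A record should finish with an 'all' mechanism unless it uses redirect=.
    last = terms[-1].lstrip("+-~?").lower()
    has_redirect = any(t.lower().startswith("redirect=") for t in terms[1:])
    if last != "all" and not has_redirect:
        problems.append("record does not end with an 'all' mechanism")
    return problems


# Example: a typical record for a domain that only sends through Microsoft 365.
print(check_spf("v=spf1 include:spf.protection.outlook.com -all"))
```

An empty list means none of these basic checks failed. It does not prove the record is correct for your domain, so still test real mail flow after a DNS change.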

Additional Resources and Support

If you need more help, here are resources and next steps:

    • Microsoft Support and Recovery Assistant (SaRA): Microsoft offers a Support and Recovery Assistant tool that end-users can run for Outlook and email issues. While it’s more often used for client issues (like not receiving emails in Outlook), it’s a good first step for a user to self-diagnose common problems.
    • Office 365 Community and Q&A: You can ask questions on Microsoft Q&A forums or the Tech Community for Exchange. Often, other admins have encountered similar issues (for example, specific NDR codes in hybrid setups) and can offer guidance.
    • Contacting Microsoft Support: For persistent or unclear issues, don’t hesitate to reach out to Microsoft 365 Support. Provide them with the NDR details, message trace results, and what troubleshooting you have done so far. They have deeper tools to investigate mail flow logs and can determine if the issue lies within Exchange Online or advise on external causes.
    • Staying Updated: Keep an eye on the Message Center in your M365 Admin portal for any updates related to mail flow, spam filtering changes, or new features that could affect how emails are delivered. Microsoft regularly updates Exchange Online, and new security features (like enhanced spam protections or stricter compliance rules) can sometimes lead to delivery questions – announcements in Message Center will prepare you for these.

By systematically following the steps in this guide, most internal-to-external email delivery problems can be identified and resolved. Remember to use the tools available (like message trace and NDR diagnostics) and leverage the error information provided. With careful verification of settings and attentive monitoring, you can ensure reliable email delivery for your organization’s users.

Adding Copilot button to desktop applications

Microsoft has just made Copilot for Microsoft 365 available for SMB customers:

Bringing the full power of Copilot to more people and businesses

So I went and signed up to get a look. The bottom line at the moment is that, yes, you can buy a single license for a tenant with Business Premium, but you need to pay for 12 months up front. Here in Australia that means around $600 inc GST up front for a minimum of 12 months. I have no doubt that I will get value, but as yet there is no month-by-month option.

image

When I open my Microsoft 365 portal now I see a new Copilot icon as shown on the left.

image

When I select that icon, I am taken to a ChatGPT-like screen, as shown above, that allows me to interact with my data in Microsoft 365 as expected.

image

If I open Word, Excel, PowerPoint, etc. on the web, I see a Copilot button in the ribbon as shown above.

image

However, I can’t see the expected Copilot button in any of the desktop applications, like Word shown above.

This video from MVP Shane Young:

https://www.youtube.com/watch?v=KROOEdZXvoY

provided all the answers.

To see the Copilot button in Word, Excel, PowerPoint, OneNote (but not Outlook or Teams) you need to

image

go into each application individually (yes all of them one by one at this stage). Select File from the menu.

image

Then select Account from the options on the left.

image

On the right, I’d suggest that you first ensure your application is up to date by selecting the Update Options button as shown above.

image

Now select the Update License button as shown.

image

You’ll see the above dialog appear. Select Sign in and sign in using the account that has been assigned a Copilot for Microsoft 365 License.

image

You should see the above message indicating the process is complete. Select Close.

Now, close and re-open the application.

image

Now the Copilot button should be visible and, because this is Word, you will also see the Draft with Copilot dialog as shown above, confirming everything is enabled.

Remember, you’ll need to do this individually for each desktop application: Word, Excel, PowerPoint, OneNote following the same process.

Things are a bit different for Outlook and Teams.

image

For Outlook you’ll need to switch over to the New Outlook by toggling the option in the top right corner of Outlook on the desktop as shown above.

image

As for Teams, you’ll just need to Sign Out and then Sign in with the account that has Copilot assigned.

image

image

You can also add the Copilot app to the left menu bar,

image

which will allow you access to the original ChatGPT style interface:

image

Don’t forget you can pin this ‘app’ to the menu bar as well by right-clicking on it.

All this and it is only day one with Copilot for Microsoft 365. Much to come. Stay tuned.

Issues with Microsoft Defender on iOS

I’m having issues with Microsoft Defender for iOS that I’m sharing here in case this may benefit others.

I think the root cause of the issue is that I have an Entra ID account (production) and a Microsoft account (consumer) that are identical. One suggested solution is simply to rename the consumer account, but I’d prefer not to do that if it can be avoided.

Here’s what typically happens:

image

My iOS device has Intune Company Portal App installed and I install Microsoft Defender manually from the iOS store. When I run Microsoft Defender I’m greeted by the screen above, which in this case only shows my consumer account.

image

The only option available is to sign up for a trial. This indicates that it doesn’t accept my production account, which includes a license for Defender for Endpoint.

In other cases, I’ve seen both my production and consumer accounts listed, but it never seems to accept my production account when my consumer account is also present.

Interestingly, I get different results depending on whether I use an iPad or an iPhone.

On my iPad, I noted that I had both my production and consumer credentials in the Microsoft Authenticator app. I removed all the credentials so there were none. I rebooted the device, added ONLY my production credentials to Microsoft Authenticator, and then I was able to log in to Microsoft Defender with my production account. Interestingly, this worked for a few days and then I had to repeat the process to get Microsoft Defender on my iPad logged back in with my production credentials again.

The story is a little different on my iPhone. I didn’t want to remove my Microsoft Authenticator app, but I did remove my consumer credentials from the Authenticator app, leaving just my production credentials there. Even after a few reboots, I still wasn’t able to log in to Microsoft Defender with my production account. Instead, I logged into Microsoft Defender using a demo M365 E5 account I had. That allowed access and Defender was working.

A few days later, on my iPhone, Defender was asking for a login. I was now able to log in with my production account and enable Defender correctly. However, I do notice that when I run Defender on the iPhone I see it switch out to Microsoft Authenticator and then switch back, as though it is checking my account. Since I have just managed to get Defender logged in on my iPhone with my production account, I’ll need to see whether it ‘sticks’ or whether it prompts me to log in again in the future.

In summary, as I said initially, the root of these issues comes down to the fact that I have the same consumer and production identity, and it seems Defender on iOS can’t differentiate between them. It also seems that Defender on iOS interacts with Microsoft Authenticator in some way, and does so differently on an iPhone and an iPad.

I’ll post more when I have done further testing.

PlatformIO code compiles but fails to execute when uploaded

image

This is a really silly one but it tripped me up for far longer than I care to admit. In essence, the issue I was having was that I would successfully compile and upload my code in PlatformIO but for some reason it wouldn’t execute on the device. I tried many, many things, including a complete re-install of the environment, to no avail.

The issue was that I was placing my code file at the root of the project file structure (where the red X is above) and not in the src directory (where the green tick is above). When I created a new project using PlatformIO, it created a new empty main.cpp in the src directory, and it was actually compiling that file and uploading it to my device. Because this default template effectively did nothing, the device looked as though it wasn’t working.

Without knowing this, I had created my code in a file, also called main.cpp, but at the root of the project structure, where it was never being compiled and uploaded! D’Oh!

Once I had my code in the main.cpp file in the src directory, it uploaded and executed on the device as expected. I probably should have read the PlatformIO documentation first:

PlatformIO IDE for VSCode

alas, I didn’t and thereby wasted hours trying to work out what was wrong! I’m glad that I’ve now worked it out and I’m sharing just in case someone else has the same issue, as I did spend heaps of time searching for a solution and found none that pointed out my error of file location.
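
For anyone who wants a quick way to spot this mistake, here is a minimal Python sketch that looks for source files sitting at the project root instead of in the src directory. The function name check_layout and the list of file extensions are my own assumptions for illustration; this is not part of PlatformIO itself, and it assumes the project uses the default src location.

```python
from pathlib import Path

# Extensions treated as source files for this check (an assumption, adjust as needed).
SOURCE_SUFFIXES = {".c", ".cpp", ".cc", ".ino"}


def check_layout(project_dir):
    """Return a list of warnings about where source files live in a project."""
    root = Path(project_dir)
    warnings = []
    # Source files sitting next to platformio.ini are the mistake described above.
    for path in sorted(root.iterdir()):
        if path.is_file() and path.suffix.lower() in SOURCE_SUFFIXES:
            warnings.append(f"{path.name} is at the project root; move it into src/")
    # The default project template keeps its sources in a src/ directory.
    if not (root / "src").is_dir():
        warnings.append("no src/ directory found")
    return warnings
```

Calling check_layout(".") from the project folder returns an empty list when nothing looks out of place, or one warning per misplaced file.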

Need to Know podcast–Episode 201

We’ve recovered from our 200th episode and are getting back into the swing of our regular programming with some updates, information and opinions from the Microsoft Cloud. We cover some recent important updates, especially in the area of security, as well as some news around Microsoft 365 and Azure. We also dip our toes quickly into the area of certifications but we’ll need more time to do justice to the topic. So stay tuned for that episode coming real soon. For now, sit back and enjoy as we get back to what we like doing – keeping you up to date with everything that’s happening in the Microsoft Cloud.

Take a listen and let us know what you think – feedback@needtoknow.cloud

You can listen directly to this episode at:

https://ciaops.podbean.com/e/episode-201-back-to-normal/

Subscribe via iTunes at:

https://itunes.apple.com/au/podcast/ciaops-need-to-know-podcasts/id406891445?mt=2

The podcast is also available on Stitcher at:

http://www.stitcher.com/podcast/ciaops/need-to-know-podcast?refid=stpr

Don’t forget to give the show a rating as well as send us any feedback or suggestions you may have for the show.

Resources

@contactbrenton

@directorcia

CIAOPS Patron Program

Microsoft Cloud outage information

Duplicating a Microsoft Planner plan using PowerShell

GitHub and free access to private repositories

Office 365 will automatically block Flash and Silverlight

Azure AD makes sharing and collaboration seamless for any user account

Microsoft’s Cyber Defense Operations Center shares best practices

Step 3 – Protect your identities. Top 10 actions to secure your environment

Get ready for the new Microsoft 365 Security Center and Microsoft 365 Compliance Center

Microsoft 365 NIST 800-53 action plan