Executive Summary
The website ownership attestation that Microsoft Copilot Studio presents when a public website is added as a knowledge source raises a real and significant concern for organizations. The attestation is not a mere procedural step but a declaration that directly affects an organization’s legal liability, particularly with respect to intellectual property rights and adherence to website terms of service.
The core understanding is that this attestation is intrinsically linked to how Copilot Studio agents leverage Bing to search and retrieve information from public websites designated as knowledge sources.1 Utilizing public websites that an organization does not own as knowledge sources, especially without explicit permission or a valid license, introduces substantial legal risks, including potential copyright infringement and breaches of contractual terms of service.3 A critical point of consideration is that while Microsoft offers a Customer Copyright Commitment (CCC) for Copilot Studio, this commitment explicitly excludes components powered by Bing.6 This exclusion places the full burden of compliance and associated legal responsibility squarely on the user. Therefore, organizations must implement robust internal policies, conduct thorough due diligence on external data sources, and effectively utilize Copilot Studio’s administrative controls, such as Data Loss Prevention (DLP) policies, to mitigate these significant risks.
1. Understanding Knowledge Sources in Microsoft Copilot Studio
Overview of Copilot Studio’s Generative AI Capabilities
Microsoft Copilot Studio offers a low-code, graphical interface designed for the creation of AI-powered agents, often referred to as copilots.7 These agents are engineered to facilitate interactions with both customers and employees across a diverse array of channels, including websites, mobile applications, and Microsoft Teams.7 Their primary function is to efficiently retrieve information, execute actions, and deliver pertinent insights by harnessing the power of large language models (LLMs) and advanced generative AI capabilities.1
The versatility of these agents is enhanced by their ability to integrate various knowledge sources. These sources can encompass internal enterprise data from platforms such as Power Platform, Dynamics 365, SharePoint, and Dataverse, as well as uploaded proprietary files.1 Crucially, Copilot Studio agents can also draw information from external systems, including public websites.1 The generative answers feature within Copilot Studio is designed to serve as either a primary information retrieval mechanism or as a fallback option when predefined topics are unable to address a user’s query.1
The Role of Public Websites as Knowledge Sources
Public websites represent a key external knowledge source type supported within Copilot Studio, enabling agents to search and present information derived from specific, designated URLs.1 When a user configures a public website as a knowledge source, they are required to provide the URL, a descriptive name, and a detailed description.2
For these designated public websites, Copilot Studio employs Bing to conduct searches based on user queries, ensuring that results are exclusively returned from the specified URLs.1 This targeted search functionality operates concurrently with a broader “Web Search” capability, which, if enabled, queries all public websites indexed by Bing.1 This dual search mechanism presents a significant consideration for risk exposure. Even if an organization meticulously selects and attests to owning a particular public website as a knowledge source, the agent’s responses may still be influenced by, or draw information from, other public websites not explicitly owned by the organization. This occurs if the general “Web Search” or “Allow the AI to use its own general knowledge” settings are active within Copilot Studio.1 This expands the potential surface for legal and compliance risks, as the agent’s grounding is not exclusively confined to the explicitly provided and attested URLs. Organizations must therefore maintain a keen awareness of these broader generative AI settings and manage them carefully to control the scope of external data access.
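The interaction between the attested URLs and the two broader settings described above can be made concrete with a small check. The following is a minimal sketch under stated assumptions: the class and field names (`AgentSettings`, `web_search_enabled`, `general_knowledge_enabled`, `owned_by_org`) are hypothetical and do not correspond to any Copilot Studio API or export format; they only model the settings named in the text.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeSource:
    url: str
    owned_by_org: bool  # mirrors the ownership attestation for this URL


@dataclass
class AgentSettings:
    # Hypothetical field names that mirror the two settings named in the text;
    # this is not an actual Copilot Studio configuration schema.
    web_search_enabled: bool
    general_knowledge_enabled: bool
    sources: list[KnowledgeSource] = field(default_factory=list)


def grounding_exposure(settings: AgentSettings) -> list[str]:
    """Return reasons the agent may answer from content outside the
    organization's owned, attested websites."""
    findings = []
    if settings.web_search_enabled:
        findings.append("Web Search is on: any Bing-indexed public site may be used.")
    if settings.general_knowledge_enabled:
        findings.append("General knowledge is on: answers may not be grounded in listed sources.")
    for source in settings.sources:
        if not source.owned_by_org:
            findings.append(f"Non-owned public website configured: {source.url}")
    return findings
```

An empty result means that, in this simplified model, the agent is confined to owned and attested sources; any entry in the list marks a path by which non-owned content could reach a response.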
Knowledge Source Management and Prioritization
Copilot Studio offers functionalities for organizing and prioritizing knowledge sources, with a general recommendation to prioritize internal documents over public URLs due to their inherent reliability and the greater control an organization has over their content.11 A notable feature is the ability to designate a knowledge source as “official”.1 This designation is applied to sources that have undergone a stringent verification process and are considered highly trustworthy, implying that their content can be used directly by the agent without further validation.
This “Official source” flag is more than a functional tag; it acts as a de facto internal signal of trust and compliance. By marking a source as “official,” an organization implicitly certifies the accuracy, reliability, and, critically, the legal usability of its content. Conversely, leaving a non-owned public website unmarked should signal higher inherent risk and call for increased caution and rigorous verification of the agent’s outputs. This feature can and should be integrated into an organization’s broader data governance framework, giving all stakeholders a clear indicator of the vetting status of external information.
2. The “Website Ownership Attestation”: A Critical Requirement
Purpose of the Attestation
When incorporating a public website as a knowledge source within Copilot Studio, users encounter an explicit prompt requesting confirmation of their organization’s ownership of the website.1 Microsoft states that enabling this option “allows Copilot Studio to access additional information from the website to return better answers”.2 This statement suggests that the attestation serves as a mechanism to unlock enhanced indexing or deeper data processing capabilities that extend beyond standard public web crawling.
The attestation thus serves a dual purpose: it acts as a legal declaration that transfers the burden of compliance directly to the user, and it functions as a technical gateway. By attesting to ownership, the user implicitly grants Microsoft, and its underlying services such as Bing, permission to perform more extensive data access and processing on that specific website. Misrepresenting ownership in this context could lead to direct legal action from the actual website owner for unauthorized access or use. Furthermore, such misrepresentation could constitute a breach of Microsoft’s terms of service, potentially affecting the user’s access to Copilot Studio services.
Why Microsoft Requires this Confirmation
Microsoft’s approach to data sourcing for its general Copilot models demonstrates a cautious stance towards public data, explicitly excluding sources that are behind paywalls, violate policies, or have implemented opt-out mechanisms.12 This practice underscores Microsoft’s awareness of and proactive efforts to mitigate legal risks associated with public data.
For Copilot Studio, Microsoft clearly defines the scope of responsibility. It states that “Any agent you create using Microsoft Copilot Studio is your own product or service, separate and apart from Microsoft Copilot Studio. You are solely responsible for the design, development, and implementation of your agent”.7 This foundational principle is further reinforced by Microsoft’s general Terms of Use for its AI services, which explicitly state: “You are solely responsible for responding to any third-party claims regarding your use of the AI services in compliance with applicable laws (including, but not limited to, copyright infringement or other claims relating to content output during your use of the AI services)”.13 This legal clause directly mandates the user’s responsibility and forms the underlying rationale for the attestation requirement.
The website ownership attestation is a concrete manifestation of Microsoft’s shared responsibility model for AI. While Microsoft provides the secure platform and powerful generative AI capabilities, the customer assumes primary responsibility for the legality and compliance of the data they feed into their custom agents and the content those agents generate. This is a critical distinction from Microsoft’s broader Copilot offerings, where Microsoft manages the underlying data sourcing. For Copilot Studio users, the attestation serves as a clear legal acknowledgment of this transferred responsibility, making due diligence on external knowledge sources paramount.
3. Legal and Compliance Implications of Using Public Websites
3.1. Intellectual Property Rights and AI
Copyright Infringement Risks
Generative AI models derive their capabilities from processing vast quantities of data, which frequently includes copyrighted materials such as text, images, and articles scraped from the internet.4 The entire lifecycle of developing and deploying generative AI systems (data collection, curation, training, and output generation) can, in many instances, constitute a prima facie infringement of copyright owners’ exclusive rights, particularly the rights of reproduction and to create derivative works.3
A significant concern arises when AI-generated outputs exhibit “substantial similarity” to the original training data inputs. In such cases, there is a strong argument that the model’s internal “weights” themselves may infringe upon the rights of the original works.3 The use of copyrighted material without obtaining the necessary licenses or explicit permissions can lead to costly lawsuits and substantial financial penalties for the infringing party.5 The legal risk extends beyond the initial act of ingesting data; it encompasses the potential for the AI agent to “memorize” and subsequently reproduce copyrighted content in its responses, leading to downstream infringement. The “black box” nature of large language models makes it challenging to trace the precise provenance of every output, placing a significant burden on the user to implement robust output monitoring and content moderation to mitigate this complex risk effectively.6
The “Fair Use” and “Text and Data Mining” Exceptions
The legal framework governing AI training on scraped data is complex and varies considerably across different jurisdictions.4 For instance, the United States recognizes a “fair use” exception to copyright law, while the European Union (EU) employs a “text and data mining” (TDM) exception.4
The United States Copyright Office (USCO) has issued a report that critically assesses common arguments for fair use in the context of AI training.3 This report explicitly states that using copyrighted works to train AI models is generally not considered inherently transformative, as these models “absorb the essence of linguistic expression.” Furthermore, the report rejects the analogy of AI training to human learning, noting that AI systems often create “perfect copies” of data, unlike the imperfect impressions retained by humans. The USCO report also highlights that knowingly utilizing pirated or illegally accessed works as training data will weigh against a fair-use defense, though it may not be determinative.3
Relying on “fair use” as a blanket defense for using non-owned public websites as AI knowledge sources is becoming increasingly precarious. The USCO’s report significantly weakens this argument, indicating that even publicly accessible content is likely copyrighted, and its use for commercial AI training is not automatically protected. The global reach of Copilot Studio agents means that an agent trained in one jurisdiction might interact with users or data subject to different, potentially stricter, intellectual property laws, creating a complex jurisdictional landscape that necessitates a conservative legal interpretation and, ideally, explicit permissions.
Table: Key Intellectual Property Risks in AI Training
| Risk Category | Description in AI Context | Relevance to Public Websites in Copilot Studio | Key Sources |
|---|---|---|---|
| Copyright Infringement | AI models trained on copyrighted material may reproduce or create derivative works substantially similar to the original, leading to claims of unauthorized copying. | High. Content on most public websites is copyrighted. Using it for AI training without permission risks infringement of reproduction and derivative work rights. | 3 |
| Terms of Service (ToS) Violation | Automated scraping or use of website content for AI training may violate a website’s ToS, which are legally binding contracts. | High. Many public websites explicitly prohibit web scraping or commercial use of their content in their ToS. | 4 |
| Right of Publicity/Misuse of Name, Image, Likeness (NIL) | AI output generating or using individuals’ names, images, or likenesses without consent, particularly in commercial contexts. | Moderate. Public websites may contain personal data, images, or likenesses, the use of which by an AI agent could violate NIL rights. | 4 |
| Database Rights | Infringement of sui generis database rights (e.g., in the EU) that protect the investment in compiling and presenting data, even if individual elements are not copyrighted. | Moderate. If the public website is structured as a database, its use for AI training could infringe upon these specific rights in certain jurisdictions. | 4 |
| Trademarks | AI generating content that infringes upon existing trademarks, such as logos or brand names, from training data. | Low to Moderate. While less direct, an AI agent could inadvertently generate trademark-infringing content if trained on branded material. | 4 |
| Trade Secrets | AI inadvertently learning or reproducing proprietary information that constitutes a trade secret from publicly accessible but sensitive content. | Low. Public websites are less likely to contain trade secrets, but if they do, their use by AI could lead to misappropriation claims. | 4 |
3.2. Terms of Service (ToS) and Acceptable Use Policies
Violations from Unauthorized Data Use
Website Terms of Service (ToS) and End User License Agreements (EULAs) are legally binding contracts that govern how data from a particular site may be accessed, scraped, or otherwise utilized.4 These agreements often include specific provisions detailing permitted uses, attribution requirements, and liability allocations.4
A considerable number of public websites expressly prohibit automated data extraction, commonly known as “web scraping,” within their ToS. Microsoft’s own general Terms of Use, for example, explicitly forbid “web scraping, web harvesting, or web data extraction methods to extract data from the AI services”.13 This position establishes a clear precedent for their stance on unauthorized automated data access and underscores the importance of respecting similar prohibitions on other websites. The legal risks extend beyond statutory copyright law to contractual obligations established by a website’s ToS. Violating these terms can lead to breach of contract claims, which are distinct from, and can occur independently of, copyright infringement. Therefore, using a public website as a knowledge source without explicit permission or a clear license, particularly if it involves automated data extraction by Copilot Studio’s underlying Bing functionality, is highly likely to constitute a breach of that website’s ToS. This means organizations must conduct a meticulous review of the ToS for every public website they intend to use, as a ToS violation can lead to direct legal action, website blocking, and reputational damage.
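A first-pass triage can help decide which terms documents need the closest reading, although it cannot replace the legal review itself. The sketch below is illustrative only: the function name `flag_tos_clauses`, the category labels, and the phrase patterns are assumptions chosen for this example, and a keyword match says nothing about how a clause would be interpreted.

```python
import re

# Illustrative phrase patterns only (an assumption for this sketch); a real
# review depends on legal judgment, not on a keyword list.
RESTRICTION_PATTERNS = {
    "automated access": re.compile(
        r"\b(scrap(e|ing)\w*|crawl\w*|harvest\w*|data extraction)", re.I
    ),
    "AI or ML use": re.compile(
        r"\b(machine learning|artificial intelligence|AI training|"
        r"train\w* (an? )?(AI|model)s?)\b",
        re.I,
    ),
    "commercial use": re.compile(r"\bcommercial (use|purposes?)\b", re.I),
}


def flag_tos_clauses(tos_text: str) -> dict[str, list[str]]:
    """Map each restriction category to the sentences that mention it."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", tos_text)
    hits: dict[str, list[str]] = {}
    for label, pattern in RESTRICTION_PATTERNS.items():
        matched = [s.strip() for s in sentences if pattern.search(s)]
        if matched:
            hits[label] = matched
    return hits
```

Sentences flagged this way are candidates for closer reading; the absence of a match does not mean the terms permit the intended use.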
Implications of Using Content Against a Website’s ToS
Breaching a website’s Terms of Service can result in a range of adverse consequences, including legal action for breach of contract, the issuance of injunctions to cease unauthorized activity, and the blocking of future access to the website.
Furthermore, if content obtained in violation of a website’s ToS is subsequently used to train a Copilot Studio agent, and that agent’s output then leads to intellectual property infringement or further ToS violations, the Copilot Studio user is explicitly held “solely responsible” for any third-party claims.7
The common assumption that “public websites” are freely usable for any purpose is a misconception. The cited sources consistently contradict it, emphasizing copyright and ToS restrictions.3 The term “public website” in this context merely signifies accessibility, not a blanket license to use its content. For AI training and knowledge sourcing, organizations must abandon the assumption of free use and adopt a rigorous due diligence process. This involves not only understanding copyright implications but also meticulously reviewing the terms of service, privacy policies, and any explicit licensing information for every external URL. Failure to do so exposes the organization to significant and avoidable legal liabilities, as the attestation transfers this burden directly to the customer.
4. Microsoft’s Stance and Customer Protections
4.1. Microsoft’s Customer Copyright Commitment (CCC)
Scope of Protection for Copilot Studio
Effective June 1, 2025, Microsoft Copilot Studio has been designated as a “Covered Product” under Microsoft’s Customer Copyright Commitment (CCC).6 This commitment signifies that Microsoft will undertake the defense of customers against third-party copyright claims specifically related to content generated by Copilot Studio agents.6 The protection generally extends to agents constructed using configurable Metaprompts or other safety systems, and features powered by Azure OpenAI within Microsoft Power Platform Core Services.6
Exclusions and Critical Limitations
Crucially, components powered by Bing, such as web search capabilities, are explicitly excluded from the scope of the Customer Copyright Commitment and are instead governed by Bing’s own terms.6 This “Bing exclusion” represents a significant gap in indemnification for public websites. The attestation for public websites is inextricably linked to Bing’s search functionality within Copilot Studio.1 Because Bing-powered components are excluded from Microsoft’s Customer Copyright Commitment, any copyright claims arising from the use of non-owned public websites as knowledge sources are highly unlikely to be covered by Microsoft’s indemnification. This means that despite the broader CCC for Copilot Studio, the legal risk for content sourced from public websites not owned by the organization, via Bing search, remains squarely with the customer. The attestation serves as a clear acknowledgment of this specific risk transfer.
Required Mitigations for CCC Coverage (where applicable)
To qualify for CCC protection for the covered components of Copilot Studio, customers must implement specific safeguards outlined by Microsoft.6 These mandatory mitigations include robust content filtering to prevent the generation of harmful or inappropriate content, adherence to prompt safety guidelines that involve designing prompts to reduce the risk of generating infringing material, and diligent output monitoring, which entails reviewing and managing the content generated by agents.6 Customers are afforded a six-month period to implement any new mitigations that Microsoft may introduce.6 These required mitigations are not merely suggestions; they are contractual prerequisites for receiving Microsoft’s copyright indemnification. For organizations, this necessitates a significant investment in robust internal processes for prompt engineering, content moderation, and continuous output review. Even for components not covered by the CCC (such as Bing-powered public website search), these mitigations represent essential best practices for responsible AI use. Implementing them can significantly reduce general legal exposure and demonstrate due diligence, regardless of direct indemnification.
Table: Microsoft’s Customer Copyright Commitment (CCC) for Copilot Studio – Scope and Limitations
| Copilot Studio Component/Feature | CCC Coverage | Conditions/Exclusions | Key Sources |
|---|---|---|---|
| Agents built with configurable Metaprompts/Safety Systems | Yes | Customer must implement required mitigations (content filtering, prompt safety, output monitoring). | 6 |
| Features powered by Azure OpenAI within Microsoft Power Platform Core Services | Yes | Customer must implement required mitigations (content filtering, prompt safety, output monitoring). | 6 |
| Bing-powered components (e.g., Public Website Knowledge Sources) | No | Explicitly excluded; follows Bing’s own terms. | 6 |
4.2. Your Responsibilities as a Copilot Studio User
Adherence to Microsoft’s Acceptable Use Policy
Users of Copilot Studio are bound by Microsoft’s acceptable use policies, which strictly prohibit any illegal, fraudulent, abusive, or harmful activities.15 This explicitly includes the imperative to respect the intellectual property rights and privacy rights of others, and to refrain from using Copilot to infringe, misappropriate, or violate such rights.15 Microsoft’s general Terms of Use further reinforce this by prohibiting users from employing web scraping or data extraction methods to extract data from Microsoft’s own AI services,13 a principle that extends to respecting the terms of other websites.
Importance of Data Governance and Data Loss Prevention (DLP) Policies
Administrators possess significant granular and tenant-level governance controls over custom agents within Copilot Studio, accessible through the Power Platform admin center.16 Data Loss Prevention (DLP) policies serve as a cornerstone of this governance framework, enabling administrators to control precisely how agents connect with and interact with various data sources and services, including public URLs designated as knowledge sources.16
Administrators can configure DLP policies to either enable or disable specific knowledge sources, such as public websites, at both the environment and tenant levels.16 These policies can also be used to block specific channels, thereby preventing agent publishing.16 DLP policies are not merely a technical feature; they are a critical organizational compliance shield. They empower administrators to enforce internal legal and ethical standards, preventing individual “makers” from inadvertently or intentionally introducing high-risk public data into Copilot Studio agents. This administrative control is vital for mitigating the legal exposure that arises from the “Bing exclusion” in the CCC and the general user responsibility for agent content. It allows companies to tailor their risk posture based on their specific industry regulations, data sensitivity, and overall risk appetite, providing a robust layer of defense.
5. Best Practices for Managing Public Website Knowledge Sources
Strategies for Verifying Website Ownership and Usage Rights
To effectively manage the risks associated with public website knowledge sources, several strategies for verification and rights management are essential:
- Legal Review of Terms of Service: A thorough legal review of the Terms of Service (ToS) and privacy policy for every single public website intended for use as a knowledge source is imperative. This review should specifically identify clauses pertaining to data scraping, AI training, commercial use, and content licensing. It is prudent to assume that all content is copyrighted unless explicitly stated otherwise.
- Direct Licensing and Permissions: Whenever feasible and legally necessary, organizations should actively seek direct, written licenses or explicit permissions from website owners. These permissions must specifically cover the purpose of using their content for AI training and subsequent output generation within Copilot Studio agents.
- Prioritize Public Domain or Openly Licensed Content: A strategic approach involves prioritizing the use of public websites whose content is demonstrably in the public domain or offered under permissive open licenses, such as Creative Commons licenses. Strict adherence to any associated attribution requirements is crucial.
- Respect Technical Directives: While not always legally binding, adhering to robots.txt directives and other machine-readable metadata that indicate a website’s preferences regarding automated access and data collection demonstrates good faith and can significantly reduce the likelihood of legal disputes.
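The robots.txt check in the last item above can be sketched with Python’s standard-library `urllib.robotparser`. This is a minimal illustration, assuming the robots.txt text has already been retrieved; the helper name `is_path_allowed`, the user agent string `example-bot`, and the sample rules are hypothetical. It reports only what the file states and is not a determination of legal permission.

```python
from urllib.robotparser import RobotFileParser


def is_path_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules for illustration.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_path_allowed(SAMPLE_ROBOTS, "example-bot", "https://example.com/docs/faq"))
print(is_path_allowed(SAMPLE_ROBOTS, "example-bot", "https://example.com/private/report"))
```

With the sample rules, the first URL is permitted and the second is not. A result of `True` indicates only that the site has not objected to automated access through robots.txt; the terms of service and copyright analysis still apply.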
Given the complex and evolving legal landscape of AI and intellectual property, proactive legal due diligence on every external URL is no longer merely a best practice; it has become a fundamental, non-negotiable requirement for responsible AI deployment. This shifts the organizational mindset from “can this data be accessed?” to “do we have the explicit legal right to use this specific data for AI training and to generate responses from it?” Ignoring this foundational step exposes the organization to significant and potentially unindemnified legal liabilities.
Considerations for Using Non-Owned Public Data
Even with careful due diligence, specific considerations apply when using non-owned public data:
- Avoid Sensitive/Proprietary Content: Exercise extreme caution and, ideally, avoid using public websites that contain highly sensitive, proprietary, or deeply expressive creative works (e.g., unpublished literary works, detailed financial reports, or personal health information). Such content should only be considered if explicit, robust permissions are obtained and meticulously documented.
- Implement Robust Content Moderation: Configure content moderation settings within Copilot Studio to filter out potentially harmful, inappropriate, or infringing content from agent outputs.1 This serves as a critical last line of defense against unintended content generation.
- Clear User Disclaimers: For Copilot Studio agents that utilize external public knowledge sources, it is essential to ensure that clear, prominent disclaimers are provided to end-users. These disclaimers should advise users to exercise caution when considering answers and to independently verify information, particularly if the source is not designated as “official” or is not owned by the organization.1
- Strategic Management of Generative AI Settings: Meticulously manage the “Web Search” and “Allow the AI to use its own general knowledge” settings within Copilot Studio.1 This control limits the agent’s ability to pull information from the broader internet, ensuring that its responses are primarily grounded in specific, vetted, and authorized knowledge sources. This approach significantly reduces the risk of unpredictable and potentially infringing content generation.
A truly comprehensive risk mitigation strategy requires a multi-faceted approach that integrates legal vetting with technical and operational controls. Beyond the initial legal assessment of data sources, configuring in-platform features like content moderation, carefully managing the scope of generative AI’s general knowledge, and providing clear user disclaimers are crucial operational measures. These layers work in concert to reduce the likelihood of infringing outputs and manage user expectations regarding the veracity and legal standing of information derived from external, non-owned sources, thereby strengthening the organization’s overall compliance posture.
Implementing Internal Policies and User Training
Effective governance of AI agents requires a strong internal framework:
- Develop a Comprehensive Internal AI Acceptable Use Policy: Organizations should create and enforce a clear, enterprise-wide acceptable use policy for AI tools. This policy must specifically address the use of external knowledge sources in Copilot Studio and precisely outline the responsibilities of all agent creators and users.15 The policy should clearly define permissible types of external data and the conditions under which they may be used.
- Mandatory Training for Agent Makers: Providing comprehensive and recurring training to all Copilot Studio agent creators is indispensable. This training should cover fundamental intellectual property law (with a focus on copyright and Terms of Service), data governance principles, the specifics of Microsoft’s Customer Copyright Commitment (including its exclusions), and the particular risks associated with using non-owned public websites as knowledge sources.15
- Leverage DLP Policy Enforcement: Actively utilizing the Data Loss Prevention (DLP) policies available in the Power Platform admin center is crucial. These policies should be configured to restrict or monitor the addition of public websites as knowledge sources, ensuring strict alignment with the organization’s defined risk appetite and compliance requirements.16
- Regular Audits and Review: Establishing a process for regular audits of deployed Copilot Studio agents, their configured knowledge sources, and their generated outputs is vital for ensuring ongoing compliance with internal policies and external regulations. This proactive measure aids in identifying and addressing any unauthorized or high-risk data usage.
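The knowledge-source portion of such an audit can be approximated by comparing an inventory of configured URLs against a list of hosts that have passed legal review. The sketch below assumes both lists are maintained by the organization outside of Copilot Studio; the function name `unvetted_sources` and the example hosts are hypothetical, and how the inventory is exported is left open.

```python
from urllib.parse import urlparse


def unvetted_sources(configured_urls, vetted_hosts):
    """Return configured knowledge-source URLs whose host is not on the vetted list."""
    vetted = {host.lower() for host in vetted_hosts}
    flagged = []
    for url in configured_urls:
        # urlparse only yields a hostname when the URL includes a scheme;
        # entries without one are flagged so that they get a manual look.
        host = (urlparse(url).hostname or "").lower()
        if host not in vetted:
            flagged.append(url)
    return flagged


# Hypothetical inventory and vetted list for illustration.
configured = ["https://www.contoso.example/help", "https://blog.example.org/post"]
vetted = ["www.contoso.example"]
print(unvetted_sources(configured, vetted))
```

Matching on the exact host is deliberately strict: a subdomain is treated as unvetted unless it is listed, which errs on the side of flagging sources for review.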
Effective AI governance and compliance are not solely dependent on technical safeguards; they are fundamentally reliant on human awareness, behavior, and accountability. Comprehensive training, clear internal policies, and robust administrative oversight are indispensable to ensure that individual “makers” fully understand the legal implications of their actions within Copilot Studio. This human-centric approach is vital to prevent inadvertent legal exposure and to foster a culture of responsible AI development and deployment within the organization, complementing technical controls with informed human decision-making.
Conclusion and Recommendations
Summary of Key Concerns
The “website ownership attestation” in Microsoft Copilot Studio, when adding public websites as knowledge sources, represents a significant legal declaration. This attestation effectively transfers the burden of intellectual property compliance for designated public websites directly to the user. The analysis indicates that utilizing non-owned public websites as knowledge sources for Copilot Studio agents carries substantial and largely unindemnified legal risks, primarily copyright infringement and Terms of Service violations. This is critically due to the explicit exclusion of Bing-powered components, which facilitate public website search, from Microsoft’s Customer Copyright Commitment. The inherent nature of generative AI, which learns from vast datasets and possesses the capability to produce “substantially similar” outputs, amplifies these legal risks, making careful data sourcing and continuous output monitoring imperative for organizations.
Actionable Advice and Recommendations
To navigate these complexities and mitigate potential legal exposure, the following actionable advice and recommendations are provided for organizations utilizing Microsoft Copilot Studio:
- Treat the Attestation as a Legal Oath: It is paramount to understand that confirming ownership in the attestation prompt constitutes a formal legal declaration. Organizations should attest to ownership only for websites that they genuinely own and control, and for which they possess the full legal rights to use content for AI training and subsequent content generation.
- Prioritize Owned and Explicitly Licensed Data: Whenever feasible, organizations should prioritize the use of internal, owned data sources (e.g., SharePoint, Dataverse, uploaded proprietary files) or external content for which clear, explicit licenses or permissions have been obtained. This approach significantly reduces legal uncertainty.
- Conduct Rigorous Legal Due Diligence for All Public URLs: For any non-owned public website being considered as a knowledge source, a meticulous legal review of its Terms of Service, privacy policy, and copyright notices is essential. The default assumption should be that all content is copyrighted, and its use should be restricted unless explicit permission is granted or the content is unequivocally in the public domain.
- Leverage Administrative Governance Controls: Organizations should proactively use the Data Loss Prevention (DLP) policies available within the Power Platform admin center. These policies should be configured to restrict or monitor the addition of public websites as knowledge sources, in line with the organization’s legal requirements and risk tolerance.
- Implement a Comprehensive AI Governance Framework: Establishing clear internal policies for responsible AI use, including specific guidelines for external data sourcing, is critical. This framework should encompass mandatory and ongoing training for all Copilot Studio agent creators on intellectual property law, terms of service compliance, and the nuances of Microsoft’s Customer Copyright Commitment. Furthermore, continuous monitoring of agent outputs and knowledge source usage should be implemented.
- Strategically Manage Generative AI Settings: Careful configuration and limitation of the “Web Search” and “Allow the AI to use its own general knowledge” settings within Copilot Studio are advised. This ensures that the agent’s responses are primarily grounded in specific, vetted, and authorized knowledge sources, thereby reducing reliance on broader, unpredictable public internet searches and mitigating associated risks.
- Provide Transparent User Disclaimers: For any Copilot Studio agent that utilizes external public knowledge sources, it is imperative to ensure that appropriate disclaimers are prominently displayed to end-users. These disclaimers should advise users to consider answers with caution and to verify information independently, especially if the source is not marked as “official” or is not owned by the organization.
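The sourcing rules in the recommendations above (attest only for owned sites, permit licensed sites without attestation, and treat everything else as restricted by default) can be sketched as a small internal review check. This is a minimal illustration under stated assumptions, not part of Copilot Studio or the Power Platform: the names `Basis`, `Decision`, `review_source`, and the example register entries are all hypothetical, and the register itself would be maintained by the organization's legal or compliance team.

```python
from dataclasses import dataclass
from enum import Enum
from urllib.parse import urlparse


class Basis(Enum):
    """Hypothetical legal basis recorded for a website after review."""
    OWNED = "owned"            # the organization owns and controls the site
    LICENSED = "licensed"      # explicit license or written permission on file
    UNREVIEWED = "unreviewed"  # no completed legal review; restricted by default


@dataclass(frozen=True)
class Decision:
    may_add: bool               # may be proposed as a knowledge source
    may_attest_ownership: bool  # maker may tick the ownership attestation


def host_of(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def review_source(url: str, register: dict) -> Decision:
    """Apply a default-deny rule using an internal register of reviewed hosts.

    Hosts are matched exactly; subdomains need their own register entry.
    """
    basis = register.get(host_of(url), Basis.UNREVIEWED)
    return Decision(
        may_add=basis in (Basis.OWNED, Basis.LICENSED),
        may_attest_ownership=basis is Basis.OWNED,
    )


# Hypothetical register maintained by the legal or compliance team.
REGISTER = {
    "contoso.com": Basis.OWNED,
    "partner.example": Basis.LICENSED,
}

owned = review_source("https://www.contoso.com/help", REGISTER)
licensed = review_source("https://partner.example/kb", REGISTER)
unknown = review_source("https://news.example.org/article", REGISTER)
```

In this sketch, a site missing from the register is neither addable nor attestable, which mirrors the default assumption that content is copyrighted until a review says otherwise. A licensed site can be added, but the maker should not attest to owning it.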
Works cited
1. Knowledge sources overview – Microsoft Copilot Studio, accessed on July 3, 2025, https://learn.microsoft.com/en-us/microsoft-copilot-studio/knowledge-copilot-studio
2. Add a public website as a knowledge source – Microsoft Copilot Studio, accessed on July 3, 2025, https://learn.microsoft.com/en-us/microsoft-copilot-studio/knowledge-add-public-website
3. Copyright Office Weighs In on AI Training and Fair Use, accessed on July 3, 2025, https://www.skadden.com/insights/publications/2025/05/copyright-office-report
4. Legal Issues in Data Scraping for AI Training – The National Law Review, accessed on July 3, 2025, https://natlawreview.com/article/oecd-report-data-scraping-and-ai-what-companies-can-do-now-policymakers-consider
5. The Legal Risks of Using Copyrighted Material in AI Training – PatentPC, accessed on July 3, 2025, https://patentpc.com/blog/the-legal-risks-of-using-copyrighted-material-in-ai-training
6. Microsoft Copilot Studio: Copyright Protection – With Conditions – schneider it management, accessed on July 3, 2025, https://www.schneider.im/microsoft-copilot-studio-copyright-protection-with-conditions/
7. Copilot Studio overview – Learn Microsoft, accessed on July 3, 2025, https://learn.microsoft.com/en-us/microsoft-copilot-studio/fundamentals-what-is-copilot-studio
8. Microsoft Copilot Studio | PDF | Artificial Intelligence – Scribd, accessed on July 3, 2025, https://www.scribd.com/document/788652086/Microsoft-Copilot-Studio
9. Copilot Studio | Pay-as-you-go pricing – Microsoft Azure, accessed on July 3, 2025, https://azure.microsoft.com/en-in/pricing/details/copilot-studio/
10. Add knowledge to an existing agent – Microsoft Copilot Studio, accessed on July 3, 2025, https://learn.microsoft.com/en-us/microsoft-copilot-studio/knowledge-add-existing-copilot
11. How can we manage and assign control over the knowledge sources – Microsoft Q&A, accessed on July 3, 2025, https://learn.microsoft.com/en-us/answers/questions/2224215/how-can-we-manage-and-assign-control-over-the-know
12. Privacy FAQ for Microsoft Copilot, accessed on July 3, 2025, https://support.microsoft.com/en-us/topic/privacy-faq-for-microsoft-copilot-27b3a435-8dc9-4b55-9a4b-58eeb9647a7f
13. Microsoft Terms of Use | Microsoft Legal, accessed on July 3, 2025, https://www.microsoft.com/en-us/legal/terms-of-use
14. AI-Generated Content and IP Risk: What Businesses Must Know – PatentPC, accessed on July 3, 2025, https://patentpc.com/blog/ai-generated-content-and-ip-risk-what-businesses-must-know
15. Copilot privacy considerations: Acceptable use policy for your bussines – Seifti, accessed on July 3, 2025, https://seifti.io/copilot-privacy-considerations-acceptable-use-policy-for-your-bussines/
16. Security FAQs for Copilot Studio – Learn Microsoft, accessed on July 3, 2025, https://learn.microsoft.com/en-us/microsoft-copilot-studio/security-faq
17. Copilot Studio security and governance – Learn Microsoft, accessed on July 3, 2025, https://learn.microsoft.com/en-us/microsoft-copilot-studio/security-and-governance
18. A Microsoft 365 Administrator’s Beginner’s Guide to Copilot Studio, accessed on July 3, 2025, https://practical365.com/copilot-studio-beginner-guide/
19. Configure data loss prevention policies for agents – Microsoft Copilot Studio, accessed on July 3, 2025, https://learn.microsoft.com/en-us/microsoft-copilot-studio/admin-data-loss-prevention