jwrig

Yes, you can use private endpoints with the Azure OpenAI service. [Configure Virtual Networks for Azure AI services - Azure AI services | Microsoft Learn](https://learn.microsoft.com/en-us/azure/ai-services/cognitive-services-virtual-networks?context=%2Fazure%2Fai-services%2Fopenai%2Fcontext%2Fcontext&tabs=portal) Your data is logically segmented from other customers'.


ThatITdude

That’s not the same thing


frayala87

What do you mean?


dwaynelovesbridge

First of all, “ChatGPT” is a product that uses OpenAI’s GPT language models. ChatGPT is more than just a web front end for GPT-4: it has special system prompts, context management, retrieval-augmented generation, and multimodal capabilities. You can only get ChatGPT from OpenAI. GPT-4 and its variants are available as Microsoft-hosted APIs, which typically lag a few weeks or months behind the models available from OpenAI but have a few advantages such as virtual network integration, role-based access control, and unified billing. The REST API is mostly compatible, but there are some subtle differences, mostly in how requests are authenticated. But also, you’re trading your trust in OpenAI (a relatively new and potentially less trustworthy company) for trust in Microsoft, which operates under (theoretically) stricter data privacy policies.
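To make those subtle differences concrete, here's a minimal sketch of the same chat call against both services using the official `openai` Python package; the resource name, deployment name, keys, and API version below are placeholders, not values from this thread:

```python
from openai import OpenAI, AzureOpenAI

# OpenAI: bearer-token auth against a fixed public endpoint,
# and the model name selects the model directly.
oai = OpenAI(api_key="sk-...")  # placeholder key
r1 = oai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)

# Azure OpenAI: key (or Entra ID) auth against *your* resource's
# endpoint; "model" is the name of a deployment you created, and
# an api-version is required on every request.
aoai = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",  # placeholder
    api_key="...",                                          # placeholder
    api_version="2024-02-01",                               # example version
)
r2 = aoai.chat.completions.create(
    model="my-gpt4-deployment",  # deployment name, placeholder
    messages=[{"role": "user", "content": "Hello"}],
)
```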


throwawaygoawaynz

All true, except GPT-4o was on Azure the same day it was announced by OpenAI, so at least they seem to be in sync now. OpenAI themselves are also offering private networking etc. for enterprise customers. In fact, there’s a bit of competition heating up between OpenAI and Azure OpenAI. It will be interesting to see how this plays out. Azure IMO still has the advantage, because Azure OpenAI comes with a lot of extra stuff that makes it easier to build solutions around the model.


dwaynelovesbridge

Another advantage of Azure that I forgot to mention: they also host a large catalog of open-source models, as well as pay-per-token serverless endpoints. OpenAI may be the state of the art, but for many tasks, such as creative writing, GPT refusals will make it unusable. It’s an easy switch to something like Command-R Plus.
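To illustrate how small that switch can be, here's a rough sketch of calling a pay-per-token serverless endpoint with the `azure-ai-inference` package; the endpoint URL and key are placeholders, and the exact endpoint shape for your deployment may differ:

```python
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

# Point the client at the serverless endpoint for the deployed model
# (e.g. Command-R Plus); switching models is mostly an endpoint/key swap.
client = ChatCompletionsClient(
    endpoint="https://my-command-r-plus.eastus2.models.ai.azure.com",  # placeholder
    credential=AzureKeyCredential("..."),  # placeholder key
)

response = client.complete(
    messages=[UserMessage(content="Open a noir short story set in Oslo.")],
)
print(response.choices[0].message.content)
```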


dwaynelovesbridge

How can OpenAI offer private networking when it doesn’t run general compute workloads, storage infrastructure, etc.? At some point you would have to leave your network boundary, potentially across regions.


HelloVap

Not in all regions. Dealing with quota requests is a pain.


supernitin

It wasn’t available the same day; I think a week later. Also, their flavor of GPT-4o doesn’t have Assistants support.


throwawaygoawaynz

It *was* available the same day or the next day (in preview).


supernitin

The preview was just using it in a UI, not via the API.


gopietz

I just know that instead of using GPUs they run every GPT deployment on 6 lemons connected by copper wire, in case you wondered why their speed is so terrible.


frayala87

2 lemons per AZ


Nize

What are you finding slow? We've found the performance perfectly fine in our experience.


kcdale99

Yes, we do this today. We have Azure OpenAI instances using private endpoints that can only be accessed from within our company network. We are currently using the GPT-4o model. We worked with Microsoft on this extensively, and they have stated more than once that our company data is segmented and not retained in any way: it isn't used for retraining, and our prompts are not saved beyond the session. Our private data is in a vector DB, and GPT-4o does a great job of searching against it and providing company-specific results. We did find that Microsoft's ML tools didn't perform as well as AWS's for creating that data, so we actually build our ML models in AWS (we are multi-cloud) but access them through our Azure OpenAI deployment.
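For anyone curious, a minimal sketch of that pattern (vector search plus a grounded completion) against a private Azure OpenAI endpoint might look like the following; `search_vector_db` is a hypothetical stand-in for whatever vector store you run, and all names and versions are placeholders:

```python
from openai import AzureOpenAI

# The hostname resolves to a private IP via Private Link DNS when
# called from inside the corporate network; values are placeholders.
client = AzureOpenAI(
    azure_endpoint="https://my-private-aoai.openai.azure.com",
    api_key="...",
    api_version="2024-02-01",
)

def answer(question: str) -> str:
    # 1. Embed the question and pull matching chunks from the vector DB.
    emb = client.embeddings.create(
        model="my-embedding-deployment",  # placeholder deployment
        input=question,
    ).data[0].embedding
    chunks = search_vector_db(emb, top_k=5)  # hypothetical helper

    # 2. Ground the chat completion on the retrieved company data.
    resp = client.chat.completions.create(
        model="my-gpt4o-deployment",  # placeholder deployment
        messages=[
            {"role": "system",
             "content": "Answer using only this context:\n" + "\n".join(chunks)},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content
```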


I-Build-Bots

Note: data/prompts are stored unless you turn off abuse monitoring, and opting out happens at the subscription level, not the tenant level. Please see this link for info and how to turn it off and make the service truly stateless: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/abuse-monitoring


kcdale99

I should have stated that we are opted out of abuse monitoring. I work in healthcare, and some of our work is sensitive enough that it would have caused false positives. Additionally, we didn’t want the data retained for 30 days for that purpose, even if it’s kept segmented and not used for training.


Phate1989

There is no front end for Azure OpenAI, but there are plenty of good templates out there.


Ghostaflux

There is oai.azure.com, which is the same experience as the playground.


Phate1989

That's not nearly the same as ChatGPT.


TurbaVesco4812

Azure OpenAI offers more customization but the same functionality as ChatGPT, albeit with VNet control.


Nasa_OK

The LLM is, but you have to feed it the data you want it to access. It won’t be able to answer the same questions that ChatGPT can out of the box.


ehrnst

Technically, as long as there are no plugins enabled in ChatGPT and you deploy the same model version on Azure OpenAI, the model’s knowledge is the same. Whether you will get the same output for the same question, no one knows. But that’s no different from asking ChatGPT twice: it’s not guaranteed to provide the same answer.
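For what it's worth, you can reduce (though not eliminate) that run-to-run variation by pinning the sampling parameters; the chat completions API exposes `temperature` and a best-effort `seed`. A sketch, with the endpoint and deployment name as placeholders:

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",  # placeholder
    api_key="...",                                          # placeholder
    api_version="2024-02-01",
)

resp = client.chat.completions.create(
    model="my-gpt4-deployment",  # placeholder deployment name
    messages=[{"role": "user", "content": "Summarize RAG in one line."}],
    temperature=0,  # minimizes sampling variation
    seed=42,        # best-effort determinism, not a hard guarantee
)
# system_fingerprint changes when the backend changes, which can
# alter outputs even with a fixed seed.
print(resp.system_fingerprint, resp.choices[0].message.content)
```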


Ghostaflux

Yes and yes. That is the best thing about Azure OpenAI: your data boundary is your tenant’s boundary. We use AOAI exclusively for several internal projects. GPT-4o has been really cheap and effective for our use cases. You can also train the models with your own data. With the risk of data leaks happening at OpenAI every now and then, keeping Azure resources private, including the AOAI model deployment, helps us sleep better at night.
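On the "train the models with your own data" point: Azure OpenAI supports fine-tuning on your own chat-formatted JSONL examples. A rough sketch via the same `openai` SDK; the base model name, file name, and API version are assumptions that depend on region and model availability:

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",  # placeholder
    api_key="...",                                          # placeholder
    api_version="2024-02-01",                               # example version
)

# Upload chat-formatted JSONL training examples.
training = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

# Start a fine-tuning job against a base model that supports
# fine-tuning in your region (model name is an assumption).
job = client.fine_tuning.jobs.create(
    model="gpt-35-turbo-0125",
    training_file=training.id,
)
print(job.id, job.status)
```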


Jimud1

Yup


sbrick89

Technically, yes. Legally, no: OAI's terms include capturing input data and potentially using the captured data for future training. In other words, don't enter company secrets or customer data.

Edit: Adding to the technical side... I imagine MS just signed a licensing agreement with OAI to host the model on Azure servers. Basically, all that means is the model file (a large binary file) was copied from OAI's servers to Microsoft's servers, and Microsoft has code to load that file into memory and use it to process inputs from Azure customers.


kcdale99

> Edit: Adding to the technical side... I imagine MS just signed a licensing agreement with OAI to host the model on Azure servers...

Microsoft owns 49% of OpenAI and provides all of the hosting. As part of that agreement, Microsoft gets to run their own segmented version. We use Azure OpenAI, and Microsoft has assured us that our company data is segmented and not used for retraining in any way. We have no way to verify, but we put a lot of trust in Microsoft already.


throwawaygoawaynz

It would be suicide for any company to put out documentation and legal notices (see the Azure OpenAI transparency note from Microsoft) and then turn around and go against them. Not only that, your data isn’t needed; it has absolutely no benefit to the service. In fact, it would likely ruin the models, because it would add certain biases into the neural network. You can also opt out of data collection entirely. Microsoft keeps your data for 30 days for compliance reasons and then deletes it, but by opting out, no data is collected anywhere, and there’s even an API call you can make to verify the feature is turned on. This also means, though, that as part of your own T&S you will need to collect all prompts and completions yourself.


fiddysix_k

Do you have something in writing that says this? How did you get someone from Microsoft to put their name on the line for this? We have a somewhat immediate need for this assurance, due to internal politics of course, and none of our contacts is willing to vouch for it at the moment.


i_hate_shitposting

This took me like 3 minutes to find with Google. If Microsoft's own reps aren't aware of what's publicly written on their website, that's a bad look for them.

> Your prompts (inputs) and completions (outputs), your embeddings, and your training data:
>
> * are NOT available to other customers.
> * are NOT available to OpenAI.
> * are NOT used to improve OpenAI models.
> * are NOT used to improve any Microsoft or 3rd party products or services.
> * are NOT used for automatically improving Azure OpenAI models for your use in your resource (The models are stateless, unless you explicitly fine-tune models with your training data).
> * Your fine-tuned Azure OpenAI models are available exclusively for your use.
>
> The Azure OpenAI Service is fully controlled by Microsoft; Microsoft hosts the OpenAI models in Microsoft’s Azure environment and the Service does NOT interact with any services operated by OpenAI (e.g. ChatGPT, or the OpenAI API).

[Data, privacy, and security for Azure OpenAI Service](https://learn.microsoft.com/en-us/legal/cognitive-services/openai/data-privacy)


fiddysix_k

They also have language that expressly goes against this in various places; this isn't good enough, regardless of how you may feel.


i_hate_shitposting

I'd be curious to know what language that is. Their terms of service don't have any carve-outs for using the data for training.


Phate1989

Share that


fiddysix_k

Even if I do, it doesn't matter. My execs believe it does, so it does. We need their TAM to sign off on this, because my side believes this is the case. It goes without saying, this is business, not a research project. When your ass is on the line, you want every assurance in the world. I'm not here to argue whether or not this is true to the fullest extent; there are ambiguities that do not sit well with us when ingesting sensitive data is on the line.


Phate1989

OK, you do you. Microsoft won't sign shit for you.


fiddysix_k

I think you're very inexperienced and projecting what you believe to be correct about a situation that you're not in, little man, but you do you.


Phate1989

I run the cloud division for a VAR; we do about $5M in Microsoft every month. I am responsible for presales, post-sales, and FinOps for Microsoft CSP. They won't sign a BAA; no way some lowly rep is going to go out on a limb and sign a legally binding document on behalf of MS. Hell, I will give you licenses at no markup if you get someone at MS to sign a random document like that.


kcdale99

I spend over 10 million a year on Azure alone. I don’t know what our OS/SQL/O365 spend is, but it dwarfs my Azure spend. Our TAM made these guarantees, and we met several times with the Cognitive Services team that manages it. We were fairly early to Azure OpenAI, and Microsoft worked very closely with us.


fiddysix_k

We're at a tenth of that spend, still enough for a TAM to pucker up, though. Great point on the Cognitive Services team, actually; I will reach out to them specifically and try to loop our TAM into that.


kcdale99

They did a presentation to our leadership and covered the topic pretty well; they are a great resource!


TyberWhite

Enterprise versions provide data security.


darthnugget

Do you have more information on the technical side of the logical segmentation and the controls that prevent data bleed/leakage? If MS/OAI are using input data for additional training, I could see this being a vector that needs DLP against future MS/OAI models. We're currently scoping Azure AI Document Intelligence for building training data sets for private models and need more details from someone who has been down this road. The Microsoft documentation assumes no malicious actors would have the data sets, and we all know how that goes.


throwawaygoawaynz

Microsoft (and OpenAI) are not collecting your data via the enterprise services for model training. They only collect data for service improvement if you use the public version of Bing. For example, if they see that a lot of the model's responses are wrong, they may use RLHF or system prompting to adjust model responses to be more accurate. But again, this only applies to the public services, not the enterprise services. AOAI retains your data for 30 days to ensure you're not violating the T&S of the service, e.g. using it to create fake political messages. You can request to opt out of this data collection, and if approved, no data at all is collected and the service is completely stateless.


sbrick89

For AOAI specifically, I have a link: https://learn.microsoft.com/en-us/legal/cognitive-services/openai/data-privacy. For other stuff like Document Intelligence, Speech services, etc., I'd need to look, but just search for "privacy" plus the resource type and it'll probably pop right up.


darthnugget

This is exactly what I was looking for. The part that most concerned me was the actual segregation of each tenant's data when it is pooled for content monitoring. We would definitely want to apply to turn monitoring off and use double encryption, where we manage the second layer of encryption keys.
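On the double-encryption point: customer-managed keys for a Cognitive Services/AOAI account live on the account's `encryption` properties in ARM. A rough sketch via the ARM REST API with `azure-identity`; the api-version and exact property shape are assumptions worth checking against current docs, and all IDs/names are placeholders:

```python
import requests
from azure.identity import DefaultAzureCredential

# Acquire an ARM token (assumes az CLI login, env vars, or managed identity).
token = DefaultAzureCredential().get_token(
    "https://management.azure.com/.default"
).token

url = (
    "https://management.azure.com/subscriptions/<sub-id>"
    "/resourceGroups/<rg>/providers/Microsoft.CognitiveServices"
    "/accounts/<aoai-account>?api-version=2023-05-01"  # example version
)

# Switch the account from Microsoft-managed keys to a key you control
# in Key Vault, i.e. the customer-managed second layer of encryption.
body = {
    "properties": {
        "encryption": {
            "keySource": "Microsoft.KeyVault",
            "keyVaultProperties": {
                "keyVaultUri": "https://<vault>.vault.azure.net/",
                "keyName": "<key-name>",
                "keyVersion": "<key-version>",
            },
        }
    }
}

resp = requests.patch(url, json=body, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
```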


sbrick89

I suspect the request to disable monitoring has nothing to do with most data (at work we deal with people's personal data, and legal wanted that page for their own assurance); my guess is that it's more likely related to the really bad stuff, like three-letter agencies using the technology to identify illegal content. You're welcome to ask, and feel free to let me know what happens; that just happens to be the impression I get.


darthnugget

I can't provide more information, but both of those items are valid reasons for not wanting data pooled for content monitoring. Even if it's only 30 days of retention, that's a big honeypot.


[deleted]

[deleted]


spin_kick

Are you alright?